COMBINATIVE  RANK-BASED  TESTS  FOR  COMPARING  RESPONSE  RATES 
AND  RESPONSE  DURATIONS  IN  RANDOMIZED  CLINICAL  TRIALS 


By 

ROBIN  MUKHERJEE 


A  DISSERTATION  PRESENTED  TO  THE  GRADUATE  SCHOOL 
OF  THE  UNIVERSITY  OF  FLORIDA  IN  PARTIAL  FULFILLMENT 
OF  THE  REQUIREMENTS  FOR  THE  DEGREE  OF 
DOCTOR  OF  PHILOSOPHY 

UNIVERSITY  OF  FLORIDA 


1999 


©  Copyright  1999 
by 

Robin  Mukherjee 


To  my  parents, 
my  sisters, 
the  Late  Mrs.  Annette  Kendall  Katzoff 
and  all  those  afflicted 
by Alzheimer's Disease or other kinds of dementia


ACKNOWLEDGEMENTS 


I  would  like  to  express  my  sincere  gratitude  to  Dr.  Myron  Chang,  without  whom 
this  work  would  never  have  been  completed.  During  all  those  times  when  I  was 
dejected  or  frustrated,  Dr.  Chang  has  certainly  played  the  role  of  the  guardian  that 
one  can  always  use  when  far  away  from  home. 

It  is  also  worthy  of  note  that  during  my  job  interviews  my  topic  of  research  received 
great  praise,  owing  to  its  direct  application  in  randomized  clinical  drug  trials,  from 
some  of  the  world's  most  renowned  pharmaceutical  companies.  I  especially  thank  Dr. 
Chang  for  his  wisdom  and  foresight. 

If  it  weren't  for  Dr.  Pam  Ohman,  it  would  be  hard  for  me  to  make  any  impact 
during  my  various  oral  presentations.  Of  course,  with  pleasure,  I  appreciate  the 
graciousness  of  my  good  friends  Ralitza  Gueorguieva,  Fe  Lorica,  and  Chen-Pin  Wang 
for  making  invaluable  comments  and  suggestions  in  this  endeavor. 

I  would  also  like  to  thank  Dr.  Jon  Shuster,  Dr.  Jim  Kepner,  and  Dr.  Chunrong 
Ai  for  providing  insight  during  the  course  of  my  Ph.D.  research  and  being  part  of  my 
Ph.D.  supervisory  committee.  I  sincerely  hope  they  do  realize  the  enormous  impact 
that their collective experience and knowledge have made on me. Special thanks go
to Dr. P.V. Rao, Dr. Ron Randles, and Carol Rozear for being there when I needed
someone  to  talk  to.  I  will  always  miss  their  presence  in  my  daily  life. 

I  am  especially  grateful  to  my  friends  Ginger  Boucher,  Eric  Diamond,  Bill  Lassiter, 
Bob Pastorello, Santosh Kamath, Allan Greenstein, David Ingersoll, Chris Stetter,
and  Gary  Wyder,  who  have  made  a  significant  impact  on  my  development  as  a  human 
being. 




I am very grateful to my precious friend Bonnie LaFleur, who has been such a
great source of inspiration and support during some of the roughest of times, while
working toward completion of a Ph.D. in Biostatistics herself.

A  very  warm  thank  you  goes  to  my  good  friends  Shelley  Katz  and  her  beautiful 
children  Melanie  and  Jake  for  being  such  wonderful  friends. 

I  thank  the  Consortium  to  Establish  a  Registry  for  Alzheimer's  Disease  (CERAD) 
for sharing their precious data on Alzheimer's Disease for demonstration of the proposed
methodology in this dissertation. It is important to point out that CERAD
would  not  exist  were  it  not  for  the  dauntless  perseverance  of  the  biostatistical  team 
under  the  leadership  of  Dr.  Albert  Heyman  at  Duke  University  Medical  Center  and 
for  the  NIA  grant,  #  AG06790. 

Finally  I  would  like  to  extend  a  special  thanks  to  Dr.  Myron  and  Ellen  Katzoff 
for  awarding  me  the  Annette  Kendall  Katzoff  Fellowship.  Without  the  fellowship 
it  would  have  been  difficult  to  participate  in  Alzheimer's  Disease  research.  It  was 
an  honor  for  me  to  have  represented  the  University  of  Florida  at  the  conference, 
"Statistical  Methodology  in  Alzheimer's  Disease  Research,"  at  Lexington,  KY.  The 
trip  was  funded  through  the  Annette  Kendall  Katzoff  trust  fund.  I  hope  my  research 
has  contributed  toward  their  noble  cause. 




TABLE  OF  CONTENTS 


ACKNOWLEDGMENTS

LIST OF TABLES

ABSTRACT

CHAPTERS

1 INTRODUCTION
    1.1 The Problem
    1.2 Common Practice
    1.3 Literature Review

2 TOPIC OF RESEARCH
    2.1 Data Structure
    2.2 Hypotheses of Interest
    2.3 Derivation of the Locally Most Powerful Test

3 ASYMPTOTICS UNDER $H_0$
    3.1 Notation in Terms of Counting Processes
    3.2 Stochastic Integrals
    3.3 Weak Convergence of Tp^ under $H_0$

4 ASYMPTOTICS UNDER $H_A$
    4.1 Introduction
    4.2 The Weighted Logrank Statistic
    4.3 Asymptotics Under a Sequence of Alternatives
    4.4 Verification of Gill's (1980) Sufficient Conditions
    4.5 Derivation of Efficacy
    4.6 Computed AREs and Efficacies

5 MONTE CARLO STUDIES
    5.1 Introduction
    5.2 Monte Carlo Power Calculations When $\rho$ is Assumed Known
    5.3 Monte Carlo Power Calculations When $\rho$ is Unknown
    5.4 Estimation of Location-Shift

6 EXAMPLE

7 SUMMARY

REFERENCES

BIOGRAPHICAL SKETCH


LIST  OF  TABLES 


Table

4.1 EFFICACIES for $a_1 = 0$ & $a_2 = 0$, (e, 7) = (L, L)
4.2 AREs for $a_1 = 0$ & $a_2 = 0$, (e, 7) = (L, L)
4.3 EFFICACIES for $a_1 = 2$ & $a_2 = 2$, (e, 7) = (L, L)
4.4 AREs for $a_1 = 2$ & $a_2 = 2$, (e, 7) = (L, L)
4.5 EFFICACIES for $a_1 = 4$ & $a_2 = 4$, (e, 7) = (L, L)
4.6 AREs for $a_1 = 4$ & $a_2 = 4$, (e, 7) = (L, L)
4.7 EFFICACIES for $a_1 = 0$ & $a_2 = 0$, (e, 7) = (E, E)
4.8 AREs for $a_1 = 0$ & $a_2 = 0$, (e, 7) = (E, E)
4.9 EFFICACIES for $a_1 = 2$ & $a_2 = 2$, (e, 7) = (E, E)
4.10 AREs for $a_1 = 2$ & $a_2 = 2$, (e, 7) = (E, E)
4.11 EFFICACIES for $a_1 = 4$ & $a_2 = 4$, (e, 7) = (E, E)
4.12 AREs for $a_1 = 4$ & $a_2 = 4$, (e, 7) = (E, E)

5.1 Observed $\alpha$ for $f(x)$ = Logistic, Reps = 3000
5.2 Observed $\alpha$ for $f(x)$ = EMV, Reps = 3000
5.3 Power for $\rho = 4$, $A_1 = 10\%$, $A_2 = 10\%$, $L_1 = 300$, $L_2 = 300$, $f(x)$ = Logistic
5.4 Power for $\rho = 4$, $A_1 = 10\%$, $A_2 = 40\%$, $L_1 = 300$, $L_2 = 300$, $f(x)$ = Logistic
5.5 Power for $\rho = 4$, $A_1 = 40\%$, $A_2 = 10\%$, $L_1 = 300$, $L_2 = 300$, $f(x)$ = Logistic
5.6 Power for $\rho = 4$, $A_1 = 40\%$, $A_2 = 40\%$, $L_1 = 300$, $L_2 = 300$, $f(x)$ = Logistic
5.7 Power for $\rho = 4$, $A_1 = 10\%$, $A_2 = 10\%$, $L_1 = 300$, $L_2 = 400$, $f(x)$ = Logistic
5.8 Power for $\rho = 4$, $A_1 = 10\%$, $A_2 = 40\%$, $L_1 = 300$, $L_2 = 400$, $f(x)$ = Logistic
5.9 Power for $\rho = 4$, $A_1 = 40\%$, $A_2 = 10\%$, $L_1 = 300$, $L_2 = 400$, $f(x)$ = Logistic
5.10 Power for $\rho = 4$, $A_1 = 40\%$, $A_2 = 40\%$, $L_1 = 300$, $L_2 = 400$, $f(x)$ = Logistic
5.11 Power for $\rho = 0.25$, $A_1 = 10\%$, $A_2 = 10\%$, $L_1 = 300$, $L_2 = 300$, $f(x)$ = Logistic
5.12 Power for $\rho = 0.25$, $A_1 = 10\%$, $A_2 = 40\%$, $L_1 = 300$, $L_2 = 300$, $f(x)$ = Logistic
5.13 Power for $\rho = 0.25$, $A_1 = 40\%$, $A_2 = 10\%$, $L_1 = 300$, $L_2 = 300$, $f(x)$ = Logistic
5.14 Power for $\rho = 0.25$, $A_1 = 40\%$, $A_2 = 40\%$, $L_1 = 300$, $L_2 = 300$, $f(x)$ = Logistic
5.15 Power for $\rho = 0.25$, $A_1 = 10\%$, $A_2 = 10\%$, $L_1 = 300$, $L_2 = 400$, $f(x)$ = Logistic
5.16 Power for $\rho = 0.25$, $A_1 = 10\%$, $A_2 = 40\%$, $L_1 = 300$, $L_2 = 400$, $f(x)$ = Logistic
5.17 Power for $\rho = 0.25$, $A_1 = 40\%$, $A_2 = 10\%$, $L_1 = 300$, $L_2 = 400$, $f(x)$ = Logistic
5.18 Power for $\rho = 0.25$, $A_1 = 40\%$, $A_2 = 40\%$, $L_1 = 300$, $L_2 = 400$, $f(x)$ = Logistic
5.19 Power for $\rho = 4$, $A_1 = 10\%$, $A_2 = 10\%$, $L_1 = 300$, $L_2 = 300$, $f(x)$ = EMV
5.20 Power for $\rho = 4$, $A_1 = 10\%$, $A_2 = 40\%$, $L_1 = 300$, $L_2 = 300$, $f(x)$ = EMV
5.21 Power for $\rho = 4$, $A_1 = 40\%$, $A_2 = 10\%$, $L_1 = 300$, $L_2 = 300$, $f(x)$ = EMV
5.22 Power for $\rho = 4$, $A_1 = 40\%$, $A_2 = 40\%$, $L_1 = 300$, $L_2 = 300$, $f(x)$ = EMV
5.23 Power for $\rho = 4$, $A_1 = 10\%$, $A_2 = 10\%$, $L_1 = 300$, $L_2 = 400$, $f(x)$ = EMV
5.24 Power for $\rho = 4$, $A_1 = 10\%$, $A_2 = 40\%$, $L_1 = 300$, $L_2 = 400$, $f(x)$ = EMV
5.25 Power for $\rho = 4$, $A_1 = 40\%$, $A_2 = 10\%$, $L_1 = 300$, $L_2 = 400$, $f(x)$ = EMV
5.26 Power for $\rho = 4$, $A_1 = 40\%$, $A_2 = 40\%$, $L_1 = 300$, $L_2 = 400$, $f(x)$ = EMV
5.27 Power for $\rho = 0.25$, $A_1 = 10\%$, $A_2 = 10\%$, $L_1 = 300$, $L_2 = 300$, $f(x)$ = EMV
5.28 Power for $\rho = 0.25$, $A_1 = 10\%$, $A_2 = 40\%$, $L_1 = 300$, $L_2 = 300$, $f(x)$ = EMV
5.29 Power for $\rho = 0.25$, $A_1 = 40\%$, $A_2 = 10\%$, $L_1 = 300$, $L_2 = 300$, $f(x)$ = EMV
5.30 Power for $\rho = 0.25$, $A_1 = 40\%$, $A_2 = 40\%$, $L_1 = 300$, $L_2 = 300$, $f(x)$ = EMV
5.31 Power for $\rho = 0.25$, $A_1 = 10\%$, $A_2 = 10\%$, $L_1 = 300$, $L_2 = 400$, $f(x)$ = EMV
5.32 Power for $\rho = 0.25$, $A_1 = 10\%$, $A_2 = 40\%$, $L_1 = 300$, $L_2 = 400$, $f(x)$ = EMV
5.33 Power for $\rho = 0.25$, $A_1 = 40\%$, $A_2 = 10\%$, $L_1 = 300$, $L_2 = 400$, $f(x)$ = EMV
5.34 Power for $\rho = 0.25$, $A_1 = 40\%$, $A_2 = 40\%$, $L_1 = 300$, $L_2 = 400$, $f(x)$ = EMV
5.35 Empirical powers for $\rho = 4$ (upper one-sided) when $\rho$ is estimated
5.36 Empirical powers for $\rho = 0.25$ (upper one-sided) when $\rho$ is estimated
6.1 Data: CERAD
6.2 Analysis: CERAD


Abstract  of  Dissertation  Presented  to  the  Graduate  School 
of  the  University  of  Florida  in  Partial  Fulfillment 
of  the  Requirements  for  the  Degree  of 
Doctor  of  Philosophy 

COMBINATIVE  RANK-BASED  TESTS  FOR  COMPARING  RESPONSE  RATES 
AND  RESPONSE  DURATIONS  IN  RANDOMIZED  CLINICAL  TRIALS 

By 

Robin  Mukherjee 
May  1999 

Chairman:  Myron  N.  Chang 
Major  department:  Statistics 

In  randomized  clinical  trials,  one  could  be  interested  in  testing  a  location  shift  in 
the  distribution  of  "response  durations"  after  a  drug  has  been  administered,  and  a 
location  shift  in  the  log-odds  of  "response  rates"  between  two  groups  (men-women  or 
treatment-control). Often such tests are carried out separately. Another approach to
analyzing data with this structure is to pool the nonresponders with the responders
and perform a classical two-sample test for right-censored data, assigning
nonresponders a response duration of 0. Owing to a lack of power in both of the
above approaches, we propose a rank-based Locally Most Powerful Test (LMPT)
for testing the two hypotheses simultaneously.

The test statistic is represented as a stochastic integral, and martingale theory
is used for deriving asymptotic properties of the test statistic under a sequence of
contiguous alternatives. A Monte Carlo study shows that the proposed tests are more
powerful than traditional tests under a variety of conditions. Settings resulting in an
LMPT, viability, and scope of application are also discussed.




CHAPTER  1 
INTRODUCTION 

1.1    The  Problem 

In drug efficacy trials a drug may be evaluated on the basis of its response rate
and  response  duration.  In  particular,  researchers  may  be  interested  in  comparing  the 
proportion  of  responders  and  the  response  duration  of  responding  subjects  from  two 
populations.  For  instance,  in  cancer  clinical  trials  one  may  be  interested  in  comparing 
two  treatments  on  the  basis  of  the  proportion  of  subjects  who  achieve  remission  as 
well  as  the  duration  of  remission  for  those  subjects  who  have  achieved  remission.  In  an 
industrial  setting  one  could  be  interested  in  comparing  two  manufacturing  processes 
on  the  basis  of  the  rate  of  non-defective  products  as  well  as  the  life-length  of  non- 
defective  items.  In  both  examples,  there  are  two  end  points:  one  associated  with  the 
proportion  and  the  other  associated  with  failure  times,  possibly  right  censored. 

1.2   Common  Practice 

Two  common  approaches  for  comparing  two  samples  based  on  response  rates  and 
response  durations  are 

1.  Separate  two-sample  tests  comparing  proportions  and  the  response  durations 
are  performed.  In  this  dissertation  these  tests  will  be  called  "separate"  tests. 

2. The subjects who fail to respond are treated as uncensored responders with
response duration zero, and a two-sample test for right censored data is performed.
From here on, these tests will be referred to as "pooled" tests.
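
The "pooled" construction in item 2 can be sketched in a few lines of Python. This is only an illustration: the record layout (a response flag, a duration, and a censoring flag per subject) and the function name `pooled_sample` are assumptions made here, not something specified in the dissertation.

```python
def pooled_sample(responded, durations, censored):
    """Build the data set used by a "pooled" two-sample test.

    Nonresponders enter as uncensored observations with duration 0;
    responders keep their own duration and censoring status.
    Returns a list of (duration, event_observed) pairs.
    """
    pooled = []
    for r, d, c in zip(responded, durations, censored):
        if r:
            pooled.append((d, not c))   # responder: keep duration, flag event
        else:
            pooled.append((0.0, True))  # nonresponder: duration 0, uncensored
    return pooled


# Hypothetical subjects: an uncensored responder, a nonresponder,
# and a responder whose duration is right censored.
sample = pooled_sample(
    responded=[True, False, True],
    durations=[14.0, None, 9.5],
    censored=[False, False, True],
)
```

Any standard two-sample test for right-censored data could then be applied to the pooled pairs from each group.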




A  goal  of  this  dissertation  is  to  demonstrate  that  separate  tests  are  likely  to  be 
less  powerful  than  "pooled"  tests.  However,  it  will  also  be  shown  that  pooled  tests  are 
not  always  the  most  powerful.  A  new  two-sample  combined  test  based  on  response 
rates  and  response  durations  is  proposed.  Combined  tests  will  be  shown  to  be  locally 
most  powerful  under  specific  settings. 

1.3    Literature  Review 

Linear  rank  statistics  (Prentice  1978)  have  been  developed  for  tests  on  regression 
coefficients  with  censored  survival  data.  These  statistics  arise  as  score  statistics  based 
on  the  marginal  probability  of  a  generalized  rank  vector. 

Let us consider the following two-sample life testing situation. We have on test
$L$ subjects, $L_1$ of them from population 1 and $L_2 = L - L_1$ from population 2.
Assume no ties among uncensored observations. Let $t_{(1)} < \cdots < t_{(k)}$ be the distinct
ordered survival times for the uncensored subjects in the combined sample and let
$m_i$ be the number of subjects censored in the interval $[t_{(i)}, t_{(i+1)})$ for $i = 0, \ldots, k$,
where $t_{(0)} = -\infty$ and $t_{(k+1)} = \infty$. Let $z_{(i)} = 0$ or $1$ according to whether the
subject that failed at $t_{(i)}$ is from population 1 or 2 and similarly let $z_{ij}$, for $j = 1, \ldots, m_i$,
be the corresponding sample indicators for the subjects censored in the interval
$[t_{(i)}, t_{(i+1)})$. Note that $L = k + \sum_{i=0}^{k} m_i$. Finally, let $n_{1i}$ be the number of
subjects with sample indicator $z = 1$, and $n_i$ the total number of subjects, at risk at
time $t_{(i)}$.

Let us fix the somewhat nonstandard notation in relation to the definition of a
generalized rank vector, $R$, associated with censored and uncensored observations,
with the aid of the following example. Suppose the values of our observations are
112, 69+, 32, 112+, with + indicating censoring. Then we have $t_{(1)} = 32$, $t_{(2)} = 112$,
$t_{11} = 69$, $t_{21} = 112$. Kalbfleisch & Prentice (1973) take the point of view that it is the
rank vector of the underlying uncensored values $t_i$, which are only partially observed
because of censoring, that is of primary interest. For instance, the rank vector underlying
the above example is known to be an element of the following set of arrangements of
ranks of the observations:

\[ \{(3,2,1,4),\ (3,1,2,4),\ (3,1,4,2)\}, \tag{1.1} \]

where $(3,2,1,4)$ indicates that the 3rd, the 2nd, the 1st, and the 4th individuals
have the ranks 1, 2, 3, and 4, respectively. The generalized rank vector
$R = (R_1, \ldots, R_k)$ is the collection of possible underlying rank vectors such as in
(1.1). The probability of $R$ is the sum of the probabilities of the underlying rank
vectors.

For the two-sample setting Prentice (1978) considered the accelerated failure time
model

\[ y = \theta + \beta z + \sigma v, \tag{1.2} \]

where $y = \log(t)$ for the event time $t$; $\theta$, $\beta$ and $\sigma$ are parameters; $z$ is the population
indicator; and $v$ is a random variable with density function $f(w)$ and survival function
$F^*(w) = \int_w^\infty f(u)\,du$. Prentice derived the rank-based LMPT for testing the
hypothesis $H_0 : \beta = 0$, which implies that the distribution functions for the two populations
are the same. Let $T_k$ be the region $0 < u_1 < \cdots < u_k < 1$,

\[ \phi(u) = -\frac{f'\{(F^*)^{-1}(1-u)\}}{f\{(F^*)^{-1}(1-u)\}}, \]

and

\[ \Phi(u) = \frac{f\{(F^*)^{-1}(1-u)\}}{1-u}. \]

Prentice's statistic has the form:

\[ T_P = \sum_{i=1}^{k} \left( z_{(i)} c_i + s_i C_i \right), \tag{1.3} \]

where $s_i = \sum_{j=1}^{m_i} z_{ij}$,

\[ c_i = \int_{T_k} \phi(u_i) \prod_{j=1}^{k} \{ n_j (1 - u_j)^{m_j}\, du_j \} = E\{\phi(u_i)\} \tag{1.4} \]

and

\[ C_i = \int_{T_k} \Phi(u_i) \prod_{j=1}^{k} \{ n_j (1 - u_j)^{m_j}\, du_j \} = E\{\Phi(u_i)\}. \tag{1.5} \]
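
Returning to the generalized rank vector example (observations 112, 69+, 32, 112+), the set of arrangements in (1.1) can be reproduced by brute force. The sketch below is illustrative only: the function name and the consistency rule it encodes (a censored subject's underlying value exceeds its censoring time, so it must rank after every uncensored value less than or equal to that time) are this sketch's reading of the example, not code from the dissertation.

```python
from itertools import permutations


def consistent_rank_vectors(times, censored):
    """Enumerate orderings of subjects consistent with censored data.

    Each returned tuple lists subject labels (1-based) from the smallest
    to the largest underlying value, as in (1.1). Assumes no ties among
    uncensored observations.
    """
    n = len(times)
    result = []
    for order in permutations(range(n)):
        pos = {subject: p for p, subject in enumerate(order)}
        ok = True
        for a in range(n):
            if censored[a]:
                continue  # constraints are anchored on uncensored subjects
            for b in range(n):
                if a == b:
                    continue
                if censored[b]:
                    # b's true value exceeds times[b] >= times[a]
                    if times[b] >= times[a] and pos[b] < pos[a]:
                        ok = False
                elif times[a] < times[b] and pos[a] > pos[b]:
                    ok = False
        if ok:
            result.append(tuple(subject + 1 for subject in order))
    return result


arrangements = consistent_rank_vectors(
    times=[112, 69, 32, 112],
    censored=[False, True, False, True],
)
```

For this example the enumeration yields exactly the three arrangements listed in (1.1).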


Prentice (1978) derived expressions for $c_i$ and $C_i$ for specific densities $f$ using the
relationship

\[ E\{(1 - u_i)^r\} = \prod_{j=1}^{i} n_j (n_j + r)^{-1}. \tag{1.6} \]

Note that with scores $c_i$ and $C_i$ generated by the true underlying density function, the
test based on $T_P$ is locally most powerful. If the underlying density is misspecified,
$T_P$ can still be used for testing $H_0$, but the loss of efficiency may be a
concern. The problem of asymptotic relative efficiency has been discussed by several
authors including Birnbaum & Laska (1967), Gastwirth (1970) (pages 89-109), Lee
et al. (1975), Prentice (1978), and Leurgans (1983).

1.3.1    Some Special Cases of $T_P$

1.  For the logistic density $f(x) = e^{x}(1 + e^{x})^{-2}$, $-\infty < x < \infty$,

\[ \phi(u) = 2u - 1 \quad\text{and}\quad \Phi(u) = u. \]

Using (1.6),

\[ E(1 - u_i) = \prod_{j=1}^{i} n_j (n_j + 1)^{-1} = \hat{S}(t_{(i)}). \]

Hence

\[ c_i = 1 - 2\hat{S}(t_{(i)}) \tag{1.7} \]


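and

\[ C_i = 1 - \hat{S}(t_{(i)}). \tag{1.8} \]

Note that $\hat{S}(t)$ is an estimator of the survival function $F^*(t)$, slightly different from
the Kaplan-Meier (1958) estimator. The statistic $T_P$ takes the form:

\[ T_{PP} = \sum_{i=1}^{k} \left[ z_{(i)}\{1 - 2\hat{S}(t_{(i)})\} + s_i\{1 - \hat{S}(t_{(i)})\} \right], \tag{1.9} \]

which is the Peto & Peto (1972) generalization of the Wilcoxon statistic. In the
case when there are no censored observations

\[ T_{PP} = -\sum_{i=1}^{L_1} \{ 2R_i (L + 1)^{-1} - 1 \} = -\frac{2}{L+1} \left\{ \sum_{i=1}^{L_1} R_i - \frac{L_1(L+1)}{2} \right\}, \tag{1.10} \]

where the $R_i$ are the ranks of the observations from sample 1.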
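
Identity (1.10) can be checked numerically on uncensored data. The sketch below assumes that, without censoring, the risk set at the $i$th ordered observation has size $L - i + 1$, so that the product of $n_j/(n_j+1)$ telescopes to $(L - i + 1)/(L + 1)$; the function names and the toy data are illustrative assumptions.

```python
from math import isclose


def peto_peto_uncensored(values, z):
    """Score form: sum of z_(i) * (1 - 2 * S_i) with S_i = (L - i + 1)/(L + 1)."""
    L = len(values)
    order = sorted(range(L), key=lambda idx: values[idx])
    total = 0.0
    for rank, idx in enumerate(order, start=1):
        s_hat = (L - rank + 1) / (L + 1)
        total += z[idx] * (1 - 2 * s_hat)
    return total


def wilcoxon_rank_form(values, z):
    """Rank form: -(2/(L+1)) * (sum of sample-1 ranks - L1 (L+1)/2)."""
    L = len(values)
    order = sorted(range(L), key=lambda idx: values[idx])
    ranks_sample1 = [r for r, idx in enumerate(order, start=1) if z[idx] == 0]
    L1 = len(ranks_sample1)
    return -(2 / (L + 1)) * (sum(ranks_sample1) - L1 * (L + 1) / 2)


# Hypothetical data: z = 0 marks sample 1, z = 1 marks sample 2.
values = [3.1, 0.4, 2.2, 5.0, 1.7]
z = [0, 1, 1, 0, 1]
score_form = peto_peto_uncensored(values, z)
rank_form = wilcoxon_rank_form(values, z)
```

Both forms agree on this example, which is the content of (1.10): the minus sign arises because the scores are summed over sample 2 while the ranks are taken from sample 1.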


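2.  Under the extreme minimum value density

\[ f(x) = e^{x - e^{x}}, \quad -\infty < x < \infty, \]

\[ \phi(u) = -\log(1 - u) - 1 \quad\text{and}\quad \Phi(u) = -\log(1 - u). \]

Again

\[ E\{\log(1 - u_i)\} = -\sum_{j=1}^{i} n_j^{-1}. \]

Hence

\[ c_i = \sum_{j=1}^{i} n_j^{-1} - 1 \tag{1.11} \]

and

\[ C_i = \sum_{j=1}^{i} n_j^{-1}. \tag{1.12} \]

The statistic $T_P$ takes the form:

\[ T_L = \sum_{i=1}^{k} \left[ z_{(i)} \left( \sum_{j=1}^{i} n_j^{-1} - 1 \right) + s_i \sum_{j=1}^{i} n_j^{-1} \right]. \tag{1.13} \]

When there is no censoring then the statistic takes the form:

\[ T_L = \sum_{i=1}^{L} z_{(i)} \left( \sum_{j=1}^{i} (L - j + 1)^{-1} - 1 \right), \]

which is the well-known logrank or Savage (1956) statistic.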

1.3.2   An Alternative Representation of $T_P$

Prentice & Marek (1979) expressed $T_P$ in terms of observed and conditionally
expected numbers of failures (conditional on the total size of the risk sets at a given
failure time). They set

\[ w_i = n_i (C_{i-1} - C_i), \quad i = 1, \ldots, k, \tag{1.14} \]

with $C_0 = 0$. Substitution in equation (1.3) yields

\[ T_P = \sum_{i=1}^{k} z_{(i)} (c_i - C_i - w_i) + \sum_{j=1}^{k} w_j \left( z_{(j)} - n_{1j} n_j^{-1} \right). \tag{1.15} \]

Note that the second term in equation (1.15) is a weighted sum of the observed number of
failures, $z_{(j)}$, in the treatment group at $t_{(j)}$ minus the conditionally expected number
of failures, $n_{1j} n_j^{-1}$, given the risk set sizes $n_{1j}$ and $n_j$ and under $H_0$. If

\[ n_i C_{i-1} = c_i + (n_i - 1) C_i, \quad i = 1, \ldots, k, \tag{1.16} \]

then the first term in (1.15) equals 0 and $T_P$ has a particularly attractive form in
terms of observed and expected numbers of failures. It can be shown that the scores
corresponding to $T_{PP}$ and $T_L$ (see equations (1.9) and (1.13)) satisfy condition (1.16).
For scores generated from the logistic density as given in equations (1.7) and (1.8),

\[ \begin{aligned}
n_i C_{i-1} &= n_i \Big\{ 1 - \prod_{j=1}^{i-1} \frac{n_j}{n_j + 1} \Big\} \\
&= 1 - 2 \prod_{j=1}^{i} \frac{n_j}{n_j + 1} + (n_i - 1) \Big\{ 1 - \prod_{j=1}^{i} \frac{n_j}{n_j + 1} \Big\} \\
&= c_i + (n_i - 1) C_i.
\end{aligned} \tag{1.17} \]

Hence,

\[ T_{PP} = T_{PM\text{-}PP} = \sum_{i=1}^{k} w_i \Big( z_{(i)} - \frac{n_{1i}}{n_i} \Big), \tag{1.18} \]

where $w_i = n_i (C_{i-1} - C_i) = c_i - C_i = -\hat{S}(t_{(i)})$. Using similar arguments, it is easy
to show that scores generated from the extreme minimum value density also satisfy
condition (1.16). Hence,

\[ T_L = T_{PM\text{-}L} = -\sum_{i=1}^{k} \Big( z_{(i)} - \frac{n_{1i}}{n_i} \Big). \tag{1.19} \]

Mehrotra et al. (1982), in a study of the relationship between $T_P$ and $T_{PM}$, show that,
in general, the optimal scores $c_i$ and $C_i$ given by Prentice (1978) do satisfy (1.16).
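
The equivalence between the score form of $T_P$ and the observed-minus-expected form can be checked numerically on a small censored sample. The sketch below is illustrative only: the helper names, the toy data, and the convention that $n_{1i}$ counts at-risk subjects with indicator $z = 1$ are assumptions of this sketch. It uses the logistic scores $c_i = 1 - 2\prod_{j \le i} n_j/(n_j+1)$, $C_i = 1 - \prod_{j \le i} n_j/(n_j+1)$ and the extreme minimum value scores $c_i = \sum_{j \le i} n_j^{-1} - 1$, $C_i = \sum_{j \le i} n_j^{-1}$.

```python
from math import inf, isclose


def risk_table(times, events, z):
    """Per failure time: (n_i, n_1i, z_(i), s_i). Assumes no tied failures."""
    fail_times = sorted(t for t, e in zip(times, events) if e)
    rows = []
    for i, tf in enumerate(fail_times):
        upper = fail_times[i + 1] if i + 1 < len(fail_times) else inf
        n = sum(1 for t in times if t >= tf)                 # risk set size
        n1 = sum(zz for t, zz in zip(times, z) if t >= tf)   # at risk with z = 1
        z_fail = sum(zz for t, e, zz in zip(times, events, z) if e and t == tf)
        s = sum(zz for t, e, zz in zip(times, events, z)
                if not e and tf <= t < upper)                # censored, z = 1
        rows.append((n, n1, z_fail, s))
    return rows


def logistic_scores(rows):
    surv, out = 1.0, []
    for n, *_ in rows:
        surv *= n / (n + 1)
        out.append((1 - 2 * surv, 1 - surv))
    return out


def emv_scores(rows):
    cum, out = 0.0, []
    for n, *_ in rows:
        cum += 1 / n
        out.append((cum - 1, cum))
    return out


def t_score_form(rows, scores):
    """Sum of z_(i) c_i + s_i C_i."""
    return sum(zf * c + s * C for (_, _, zf, s), (c, C) in zip(rows, scores))


def t_observed_minus_expected(rows, scores):
    """Sum of w_i (z_(i) - n_1i / n_i) with w_i = c_i - C_i."""
    return sum((c - C) * (zf - n1 / n)
               for (n, n1, zf, _), (c, C) in zip(rows, scores))


# Hypothetical data: 69 and the second 112 are censored.
rows = risk_table(times=[32, 69, 112, 112, 150],
                  events=[1, 0, 1, 0, 1],
                  z=[0, 1, 1, 0, 1])
t_pp = t_score_form(rows, logistic_scores(rows))
t_l = t_score_form(rows, emv_scores(rows))
```

On this sample both score families give the same value in either form, consistent with the claim that the two representations coincide for these scores.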


CHAPTER  2 
TOPIC  OF  RESEARCH 

2.1    Data  Structure 

Assume that the numbers of subjects available in Group 1 and Group 2 are $L_1$ and
$L_2$ ($L = L_1 + L_2$) and that the numbers of responders in Group 1 and Group 2 are $K_1^*$
and $K_2^*$ respectively, where both $K_1^*$ and $K_2^*$ are random ($K_1^* + K_2^* = K^*$). Clearly,
when referring to responders, $K^*$ is the size of the risk set at time 0 in the combined
sample. Conditional on $K_1^* = k_1^*$ and $K_2^* = k_2^*$, we are able to observe "response
durations" from each group. Let $k_1$ and $k_2$ be the numbers of uncensored response
durations in the respective groups ($k = k_1 + k_2$). Note that $k_i \le k_i^*$, $i = 1, 2$, and
$k \le k^*$. Assume no ties among uncensored response durations. Let $x_{(1)}, \ldots, x_{(k)}$ be the
ordered uncensored log-response durations in the pooled sample. Let $(x_{i1}, \ldots, x_{i m_i})$
be the unordered censored log-response durations in $[x_{(i)}, x_{(i+1)})$, $i = 0, \ldots, k$, where
$x_{(0)} = -\infty$ and $x_{(k+1)} = \infty$. Note that $k^* = k + \sum_{i=0}^{k} m_i$. Also, let the $z_{(i)}$ and $z_{ij}$ be
the group indicators associated with the $x_{(i)}$ and $x_{ij}$, respectively. Let the response
rates in the respective groups be $p_1$ and $p_2$. Assuming that the distribution functions
for response durations, $F_1$ and $F_2$, are absolutely continuous, let the corresponding
densities in the respective groups be $f_1$ and $f_2$.




2.2    Hypotheses  of  Interest 

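We consider the following null hypothesis $H_0$ and alternative hypothesis $H_A$:

\[ H_0 : f_1(x) = f_2(x) = f(x) \quad\text{and}\quad p_1 = p_2 = p, \tag{2.1} \]

versus

\[ H_A : \begin{cases}
f_1(x) = f(x + c\alpha_1\Delta), \\
f_2(x) = f(x - c\alpha_2\Delta), \\
\log\dfrac{p_1}{1 - p_1} = \log\dfrac{p}{1 - p} - d\beta_1\Delta, \\
\log\dfrac{p_2}{1 - p_2} = \log\dfrac{p}{1 - p} + d\beta_2\Delta,
\end{cases} \tag{2.2} \]

where $\Delta$, $c$, $d$, $\alpha_i$ and $\beta_i$ ($i = 1, 2$) are nonnegative parameters with
$\alpha_1 + \alpha_2 = \beta_1 + \beta_2 = 1$. Under $H_A$,

\[ f_2(x) = f_1(x - c\Delta) \]

and

\[ \log\frac{p_2(1 - p_1)}{p_1(1 - p_2)} = d\Delta. \]

Note that the key issue in the above alternative hypothesis is that $c$ need not be equal
to $d$. In other words, the degree of shift in the failure time distributions and that in
the log-odds ratios could be different.

Effectively we will be interested in testing the hypotheses $H_0 : \Delta = 0$ vs $H_A :
\Delta > 0$. Owing to the nature of the hypotheses it is reasonable to derive an LMPT.
The following section discusses the derivation of the LMPT statistic.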
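
As a small illustration of how the alternative is parameterized, the following Python sketch maps $(p, c, d, \Delta, \alpha_1, \beta_1)$ to the group-specific response rates and location shifts. It assumes, as a reading of the alternative hypothesis and not as the author's code, that the log-odds of response move by $-d\beta_1\Delta$ in group 1 and $+d\beta_2\Delta$ in group 2, and that a density $f(x + a)$ corresponds to shifting a baseline log-duration by $-a$. The function name is hypothetical.

```python
from math import exp, isclose, log


def alternative_parameters(p, c, d, delta, alpha1, beta1):
    """Return (p1, p2, shift1, shift2) under the alternative.

    shift_i is the amount added to a baseline log-duration drawn from f
    to obtain a log-duration in group i.
    """
    alpha2, beta2 = 1 - alpha1, 1 - beta1
    logit_p = log(p / (1 - p))
    p1 = 1 / (1 + exp(-(logit_p - d * beta1 * delta)))
    p2 = 1 / (1 + exp(-(logit_p + d * beta2 * delta)))
    shift1 = -c * alpha1 * delta   # f1(x) = f(x + c * alpha1 * delta)
    shift2 = c * alpha2 * delta    # f2(x) = f(x - c * alpha2 * delta)
    return p1, p2, shift1, shift2


p1, p2, shift1, shift2 = alternative_parameters(
    p=0.3, c=1.0, d=2.0, delta=0.1, alpha1=0.5, beta1=0.5
)
log_odds_ratio = log(p2 * (1 - p1) / (p1 * (1 - p2)))
```

Whatever split $(\alpha_1, \beta_1)$ is chosen, the total location shift between groups is $c\Delta$ and the log-odds ratio is $d\Delta$, which is why $c$ and $d$ can differ.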




2.3   Derivation  of  the  Locally  Most  Powerful  Test 


In  the  uncensored  data  case  Chang  et  al.  (1994)  defined  the  burden-of-illness  as 
the  sum  of  severity  scores  over  all  cases  in  each  group.  In  their  context,  burden-of- 
illness  represented  a  measure  of  total  morbidity  due  to  the  disease  in  each  group.  The 
test  proposed  by  Chang  et  al.  (1994),  however,  is  sensitive  to  the  severity  scoring 
method.  In  particular,  the  test  may  have  poor  local  power  when  the  underlying 
distribution  of  severity  score  has  heavy  tails.  One  alternative  is  to  use  the  ranks  of 
the  severity  scores  instead  of  their  actual  measurements.  In  this  manuscript,  severity 
scores  correspond  to  response  duration. 

The  generalized  rank  vector  R,  as  discussed  in  section  1.3,  plays  a  crucial  role  in 
the  construction  of  the  LMPT.  Rank-based  tests  are  robust  to  misspecification  of  the 
underlying  distribution  of  response  durations. 

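In order to derive the locally most powerful test we will use the Neyman-Pearson
(N-P) lemma. By the N-P lemma, the statistic for testing $H_0$ versus a fixed $H_A$ is

\[ \left\{ \frac{P_{H_A}(R = r \mid K_1^* = k_1^*, K_2^* = k_2^*)}{P_{H_0}(R = r \mid K_1^* = k_1^*, K_2^* = k_2^*)} \right\}
\left\{ \frac{P_{H_A}(K_1^* = k_1^*, K_2^* = k_2^*)}{P_{H_0}(K_1^* = k_1^*, K_2^* = k_2^*)} \right\}. \tag{2.3} \]

The contents within the first curly brackets of equation (2.3), denoted as $p_\Delta(r)/p_0(r)$,
will be addressed first. By Taylor's expansion, in the notation of Prentice (1978),

\[ p_\Delta(r) = p_0(r) + \Delta \left. \frac{dp_\Delta(r)}{d\Delta} \right|_{\Delta = 0} + \Delta\, o(1). \]

Dividing both sides by $p_0(r)$, we have

\[ \frac{p_\Delta(r)}{p_0(r)} = 1 + \Delta\, \frac{1}{p_0(r)} \left. \frac{dp_\Delta(r)}{d\Delta} \right|_{\Delta = 0} + \Delta\, o(1). \tag{2.4} \]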

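Indeed, $p_\Delta(r) = P_{H_A}(R = r \mid K_1^* = k_1^*, K_2^* = k_2^*)$ is the probability of the (random)
rank vector, $p(r)$, in the notation of Prentice (1978) and can be expressed as

\[ p_\Delta(r) = \int_{x_{(1)} < \cdots < x_{(k)}} \prod_{i=1}^{k} \Big[ f\big(x_{(i)} - c\Delta(z_{(i)} - \alpha_1)\big) \prod_{j=1}^{m_i} \big\{ 1 - F\big(x_{(i)} - c\Delta(z_{ij} - \alpha_1)\big) \big\}\, dx_{(i)} \Big], \tag{2.5} \]

where the integral is over the region $x_{(1)} < \cdots < x_{(k)}$. In equation (2.5) we are making
the assumption that all censoring happens at the left end point of each interval of
censoring, i.e., $x_{ij} = x_{(i)}$ for all $j = 1, \ldots, m_i$. By straightforward calculations,

\[ p_0(r) = P_0(R = r \mid K_1^* = k_1^*, K_2^* = k_2^*) = \Big( \prod_{i=1}^{k} n_i \Big)^{-1}, \tag{2.6} \]

where $n_i$ is the size of the risk set at log-time $x_{(i)}$. Following equations (2.4), (2.5),
(2.6), and the approach on page 170 of Prentice (1978),

\[ \frac{p_\Delta(r)}{p_0(r)} = 1 + \Delta c \sum_{i=1}^{k} \Big( z_{(i)} c_i + \sum_{j=1}^{m_i} z_{ij} C_i \Big) + \Delta\, o(1), \tag{2.7} \]

where $o(1) \to 0$ as $\Delta \to 0$,

\[ c_i = \int_{x_{(1)} < \cdots < x_{(k)}} \Big[ -\frac{d \log f(x_{(i)})}{dx_{(i)}} \Big] \prod_{j=1}^{k} \big\{ n_j f(x_{(j)}) \{1 - F(x_{(j)})\}^{m_j}\, dx_{(j)} \big\}, \tag{2.8} \]

and

\[ C_i = \int_{x_{(1)} < \cdots < x_{(k)}} \Big[ -\frac{d \log \{1 - F(x_{(i)})\}}{dx_{(i)}} \Big] \prod_{j=1}^{k} \big\{ n_j f(x_{(j)}) \{1 - F(x_{(j)})\}^{m_j}\, dx_{(j)} \big\}. \tag{2.9} \]

In the derivation of equation (2.7), we have used the fact that $\sum_{i=1}^{k} (c_i + m_i C_i) = 0$.
Note that the second term in equation (2.7) does not depend on the choice of $\alpha_1$ and
$\alpha_2$. In equation (2.7), $c_i$ is the score for an uncensored observation and $C_i$ is the
score for a censored observation at $x_{(i)}+$. The content in the second curly brackets
of equation (2.3) is

\[ 1 + \Delta \left\{ \frac{1}{P_0(K_1^* = k_1^*, K_2^* = k_2^*)} \left. \frac{dP_\Delta(K_1^* = k_1^*, K_2^* = k_2^*)}{d\Delta} \right|_{\Delta = 0} \right\} + \Delta\, o(1). \tag{2.10} \]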

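Clearly, $P_0(K_1^* = k_1^*, K_2^* = k_2^*) = P_0(K_1^* = k_1^*) P_0(K_2^* = k_2^*)$ is the product of two
binomial probabilities:

\[ P_0(K_1^* = k_1^*, K_2^* = k_2^*) = \binom{L_1}{k_1^*} \binom{L_2}{k_2^*} p^{k^*} (1 - p)^{L - k^*}. \tag{2.11} \]

Similarly,

\[ P_\Delta(K_1^* = k_1^*, K_2^* = k_2^*) = \prod_{i=1}^{2} \binom{L_i}{k_i^*} p_i^{k_i^*} (1 - p_i)^{L_i - k_i^*}. \tag{2.12} \]

From equation (2.2) we see that

\[ \frac{p_1}{1 - p_1} = \frac{p}{1 - p}\, e^{-d\beta_1\Delta}, \]

i.e.,

\[ \log \frac{p_1 (1 - p)}{p (1 - p_1)} = -d\beta_1\Delta, \tag{2.13} \]

so that

\[ \left. \frac{dp_1}{d\Delta} \right|_{\Delta = 0} = -d\beta_1\, p(1 - p). \tag{2.14} \]

Similarly,

\[ \left. \frac{dp_2}{d\Delta} \right|_{\Delta = 0} = d\beta_2\, p(1 - p). \tag{2.15} \]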


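Using equations (2.10), (2.11), (2.12), (2.14) and (2.15),

\[ \frac{P_\Delta(K_1^* = k_1^*, K_2^* = k_2^*)}{P_0(K_1^* = k_1^*, K_2^* = k_2^*)} = 1 + \Delta d \big\{ (\beta_2 k_2^* - \beta_1 k_1^*) - (\beta_2 L_2 - \beta_1 L_1) p \big\} + \Delta\, o(1). \tag{2.16} \]

By the results in equations (2.7) and (2.16), we can write equation (2.3) as:

\[ \frac{P_{H_A}(K_1^* = k_1^*, K_2^* = k_2^*, R = r)}{P_{H_0}(K_1^* = k_1^*, K_2^* = k_2^*, R = r)} = 1 + \Delta c \Big[ \sum_{i=1}^{k} \Big( z_{(i)} c_i + \sum_{j=1}^{m_i} z_{ij} C_i \Big) + \rho \big\{ (\beta_2 k_2^* - \beta_1 k_1^*) - (\beta_2 L_2 - \beta_1 L_1) p \big\} \Big] + \Delta\, o(1), \tag{2.17} \]

where $\rho = d/c$. Since the first term in (2.17) is a constant and the third term is
negligible for small $\Delta$, the LMPT statistic (see Randles & Wolfe 1991, page 295) for
$H_0 : \Delta = 0$ versus $H_A : \Delta > 0$ is the statistic in the brackets of (2.17), i.e.,

\[ T = T_P + \rho T_B, \tag{2.18} \]

where

\[ T_P = \sum_{i=1}^{k} \Big( z_{(i)} c_i + \sum_{j=1}^{m_i} z_{ij} C_i \Big) \]

and

\[ T_B = \big\{ (\beta_2 k_2^* - \beta_1 k_1^*) - (\beta_2 L_2 - \beta_1 L_1) p \big\}. \]

Note that the LMPT is based on the numbers of responders $k_1^*$, $k_2^*$ and the generalized
rank vector $R$. If we pick $\beta_1 = L_2/L$ and $\beta_2 = L_1/L$, then $T_B$ takes the form

\[ T_B = \frac{L_1 L_2}{L} \left\{ \frac{k_2^*}{L_2} - \frac{k_1^*}{L_1} \right\}. \]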

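Hereafter, our test statistic will be referred to as $T$ and is given by

\[ T = T_P + \rho T_B = \sum_{i=1}^{k} \big( z_{(i)} c_i + s_i C_i \big) + \rho\, \frac{L_1 L_2}{L} \left\{ \frac{k_2^*}{L_2} - \frac{k_1^*}{L_1} \right\}, \tag{2.19} \]

where $\rho = d/c$. Note that the locally most powerful test $T$ for testing the hypothesis
(2.1) versus (2.2) depends on $\rho$.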

It is seen from the derivation that the test statistic does not depend on the choice
of $\alpha$. However, the choice of $\beta$ affects the structure of the test statistic. There could
be concern regarding the choice of $\beta$. Let us consider two different choices of $\beta$,
namely $(\beta_1, \beta_2)$ and $(\beta_1', \beta_2')$. We note that

\[ \begin{aligned}
T_B(\beta_1, \beta_2) - T_B(\beta_1', \beta_2')
&= \big\{ (\beta_2 k_2^* - \beta_1 k_1^*) - (\beta_2 L_2 - \beta_1 L_1) p \big\} - \big\{ (\beta_2' k_2^* - \beta_1' k_1^*) - (\beta_2' L_2 - \beta_1' L_1) p \big\} \\
&= (k_1^* - L_1 p)(\beta_1' - \beta_1) - (k_2^* - L_2 p)(\beta_2' - \beta_2) \\
&= (k^* - L p)(\beta_1' - \beta_1).
\end{aligned} \]

By choosing a different set of $\beta$, the difference in the test statistic in (2.18) is
proportional to the difference between the total number of responders and the expected
number (under $H_0$) of responders. If the consistent estimator $\hat{p} = k^*/L$ is substituted
for $p$, then

\[ T_B(\beta_1, \beta_2) - T_B(\beta_1', \beta_2') = 0. \]

Hence, it is reasonable to choose $\beta_1 = L_2/L$ and $\beta_2 = L_1/L$. When $\beta_1$ and $\beta_2$ are
chosen as above then $T_B$ is a test for comparing a difference of two proportions. Also,
as shall be discussed in Chapter 3, this structure is easily written in the stochastic
integral form. Asymptotic properties are easily established.
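
The role of the choice of $\beta$ can be illustrated numerically. The sketch below uses the binomial component $(\beta_2 k_2^* - \beta_1 k_1^*) - (\beta_2 L_2 - \beta_1 L_1)p$ with $\beta_2 = 1 - \beta_1$; the function name `t_b` and the counts are hypothetical values chosen for illustration.

```python
from math import isclose


def t_b(k1, k2, L1, L2, p, beta1):
    """Binomial component of the statistic for a given beta1 (beta2 = 1 - beta1)."""
    beta2 = 1 - beta1
    return (beta2 * k2 - beta1 * k1) - (beta2 * L2 - beta1 * L1) * p


# Hypothetical counts: 12 of 30 responders in group 1, 20 of 40 in group 2.
k1, k2, L1, L2 = 12, 20, 30, 40
L, k_star = L1 + L2, k1 + k2

# With beta1 = L2/L the term involving p vanishes, leaving a scaled
# difference of the two sample proportions.
balanced = t_b(k1, k2, L1, L2, p=0.37, beta1=L2 / L)
proportion_form = (L1 * L2 / L) * (k2 / L2 - k1 / L1)

# Two arbitrary choices of beta1 differ by (k* - L p)(beta1' - beta1) ...
diff_known_p = t_b(k1, k2, L1, L2, 0.37, 0.2) - t_b(k1, k2, L1, L2, 0.37, 0.7)
# ... and agree once p is replaced by the pooled estimate k*/L.
p_hat = k_star / L
diff_estimated_p = t_b(k1, k2, L1, L2, p_hat, 0.2) - t_b(k1, k2, L1, L2, p_hat, 0.7)
```

This is the sense in which the choice $\beta_1 = L_2/L$, $\beta_2 = L_1/L$ costs nothing once $p$ is estimated by $k^*/L$.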


CHAPTER  3 
ASYMPTOTICS UNDER $H_0$


In  this  chapter  we  shall  first  introduce  notation  in  terms  of  counting  processes. 
Using  the  machinery  available  for  counting  processes  we  shall  use  the  Martingale 
Central  Limit  Theorem  to  establish  the  large  sample  distribution  of  the  statistic  T 
given  in  equation  (2.19). 

3.1    Notation  in  Terms  of  Counting  Processes 

Let  Tij  be  the  log-response-duration  for  subject  j  from  group  i,  and  let  Uij  be  the 
log-censoring  time  for  subject  j  from  group  i,  z  =  1, 2;  j  =  1,  •  •  ■ ,  Lj.  Note  that  for 
nonresponders  the  log- response-duration  is  —  oo.  If  the  censoring  time  is  larger  than 
the  failure  time  then  the  failure  time  is  observed,  and  hence  is  uncensored.  If  the 
censoring  time  is  smaller  than  the  failure  time  then  we  observe  the  censoring  time, 
and  label  that  observation  as  censored.  Under  the  random  censorship  model,  Tij  and 
Uij  are  assumed  to  be  independent.  The  available  information  on  subject  j  from 
group  i  is  the  pair  of  random  variables 

Xij  =  min{Tij,  Uij)  and  6ij  =  I{Xij  =  Tij}, 

where  5ij  is  the  censoring  indicator. 

It  is  assumed  that  all  T^s  and  UijS  are  mutually  independent  and  within  group  i, 
TijS  and  UijS  are  identically  distributed  with  distribution  functions 

Fi{t)  =  P{Tij  <t)  =  l-  P{Tij  >t)  =  l-  Si{t), 


15 


16 


and 

mt)  =  p{Uij  <t)  =  i-  p{Uij  >t)  =  i-  Ciit), 

respectively,  where  Si  is  the  corresponding  survival  function  and  Cj  is  the  correspond- 
ing censoring  survival  function.  Let 

mit)  =  P{X,j  >t)^  Si{t)Ci{t), 

and 

Nijit)  =  I{X,j  <  t,5ij  =  1};       Ni{t)  =J2Nij{t)-       N{t)  =  E^^W- 

j=i  t=i 

be  the  counting  processes  that  count  the  number  of  failures  at  or  before  log-time  t  for 
the  j^^  individual  in  group  i,  and  in  both  groups,  respectively,  at  or  before  log-time 
t.  The  corresponding  counting  processes  for  censored  observations  are: 

N^Ai)  =  nXij  <  t,     =  0};       iVf  W  =  X;  N^it);       N^{t)  =  f;  iVf  (i). 

j=i  t=i 

Left-continuous  processes  Y  that  are  the  sizes  of  the  risk  sets  just  before  time  t 
are  defined  as  follows: 

Yijit)  =  I{Xi,  >  t};       Yi{t)  =  X:  Yijit);       Y{t)  =  Yi{t). 

j=i  1=1 

Hence,  Yi{t)  is  the  size  of  the  risk  set  in  group  i  at  time  {t  —  0).  Y{t)  denotes  the 
total  size  of  the  risk  set  at  time  {t  —  0). 

The cumulative hazard function is defined as

$$\Lambda_i(t) = \int_{-\infty}^{t} \{1 - F_i(s-)\}^{-1}\, dF_i(s).$$

Note that this definition of $\Lambda_i(t)$ uses the left-continuous version of the distribution
function. This facilitates handling tied observations.
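To see why the left-continuous version matters when the distribution has atoms, consider a purely discrete distribution. The sketch below (function name and probability masses are hypothetical) computes the hazard jumps dF(s) / (1 - F(s-)) and checks that the product of (1 - jump) recovers the survival function.

```python
import math

def discrete_cumulative_hazard(masses):
    """Return hazard jumps dF(s) / (1 - F(s-)) and their running sums for ordered atoms."""
    jumps, cumulative = [], []
    f_left = 0.0   # F(s-): probability mass strictly before the current atom
    total = 0.0
    for mass in masses:
        jump = mass / (1.0 - f_left)
        total += jump
        jumps.append(jump)
        cumulative.append(total)
        f_left += mass
    return jumps, cumulative

masses = [0.2, 0.3, 0.5]  # hypothetical masses at three ordered points
jumps, cumulative = discrete_cumulative_hazard(masses)
survival_after_two = math.prod(1.0 - j for j in jumps[:2])
```

The last atom carries all remaining mass, so its hazard jump equals one; with the right-continuous F in the denominator that jump would be undefined.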




3.2   Stochastic  Integrals 

In  the  previous  chapter  we  have  derived  the  structure  of  the  LMPT.  From  the 
structure  as  given  in  equation  (2.19)  we  see  that  our  test  statistic  is: 


(3.1) 


3.2.1    The  Test  Statistic  T  in  the  Stochastic  Integral  Form 


We notice from the structure of equation (3.1) that our statistic is a linear
combination of Prentice's (1978) statistic and a test for two sample proportions, or
binomial test. Prentice and Marek (1979) and Mehrotra et al. (1982) have shown that
Prentice's statistic can be written in terms of the observed number of failures and the
conditionally expected number of failures (conditional on the total size of the risk set).
Andersen et al. (1982), page 234, have shown that such a statistic can be written as
a Weighted Logrank Statistic in the counting process framework. Specifically, the
test statistic in (3.1) is


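$$T = \int_{(-\infty,\infty]} \frac{Y_1(s)Y_2(s)}{Y(s)}\,\{C(s) - c(s)\}\left\{\frac{dN_1(s)}{Y_1(s)} - \frac{dN_2(s)}{Y_2(s)}\right\} + \rho\,\frac{L_1L_2}{L}\left(\frac{k_2^*}{L_2} - \frac{k_1^*}{L_1}\right), \tag{3.2}$$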


where c(s) and C(s) are the left-continuous versions of Prentice's scores as mentioned
in Prentice (1978) and Andersen et al. (1982). Observe that $N_i(-\infty)$ can be
interpreted as the number of nonresponders in group i: $dN_i(-\infty) = L_i - k_i^*$. Also note
that

$$\frac{dN_i(-\infty)}{Y_i(-\infty)} = \frac{L_i - k_i^*}{L_i}, \qquad i = 1, 2.$$

Hence,

$$\frac{k_2^*}{L_2} - \frac{k_1^*}{L_1} = \frac{dN_1(-\infty)}{Y_1(-\infty)} - \frac{dN_2(-\infty)}{Y_2(-\infty)}.$$




Therefore the test statistic T takes the form:

$$T = \int_{[-\infty,\infty]} W(s)\,\frac{Y_1(s)Y_2(s)}{Y(s)}\left\{\frac{dN_1(s)}{Y_1(s)} - \frac{dN_2(s)}{Y_2(s)}\right\}, \tag{3.3}$$

where

$$W(s) = \begin{cases} C(s) - c(s), & s > -\infty \\ \rho, & s = -\infty. \end{cases} \tag{3.4}$$

Note that W(s) is an adapted, left-continuous process, i.e., predictable. Hence, the
statistic T is a Weighted Logrank Statistic as discussed in Gill (1980), Andersen et
al. (1982), Harrington and Fleming (1982), and Fleming and Harrington (1991).
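The stochastic-integral form reduces to a finite sum over the distinct uncensored observation times, with the atom at minus infinity carrying the weight rho. The following is a minimal sketch for the logrank scores only (C(s) - c(s) = 1); the function name, the argument layout, and the two small groups are hypothetical and are not taken from the dissertation.

```python
import math

def combined_logrank_statistic(x1, d1, x2, d2, rho):
    """Sum form of the weighted logrank integral with W(s) = 1 for s > -inf and rho at -inf.

    x1, x2: observed log-durations (use -math.inf for nonresponders).
    d1, d2: indicators, 1 = uncensored (nonresponders count as uncensored).
    """
    failure_times = sorted({x for x, d in zip(x1 + x2, d1 + d2) if d == 1})
    total = 0.0
    for s in failure_times:
        y1 = sum(1 for x in x1 if x >= s)   # Y_1(s): at risk in group 1 just before s
        y2 = sum(1 for x in x2 if x >= s)   # Y_2(s)
        if y1 == 0 or y2 == 0:
            continue                        # Y_1 Y_2 / Y = 0, so no contribution
        dn1 = sum(1 for x, d in zip(x1, d1) if x == s and d == 1)
        dn2 = sum(1 for x, d in zip(x2, d2) if x == s and d == 1)
        w = rho if s == -math.inf else 1.0
        total += w * (y1 * y2 / (y1 + y2)) * (dn1 / y1 - dn2 / y2)
    return total

# Hypothetical data: group 1 has two nonresponders, group 2 has one.
x1, d1 = [-math.inf, -math.inf, 0.2, 0.9], [1, 1, 1, 0]
x2, d2 = [-math.inf, 0.5, 1.3, 1.7], [1, 1, 1, 1]
t_value = combined_logrank_statistic(x1, d1, x2, d2, rho=1.0)
```

In this example the atom at minus infinity contributes rho times (L1 L2 / L)(k2*/L2 - k1*/L1) = 0.5 rho, and the two informative failure times contribute 0.6 and -0.25, so setting rho to zero leaves only the duration component.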




3.2.2    Special  Cases  And  Discussion 
Special  Cases 

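As given in equations (1.7), (1.8), (1.11), and (1.12), $(C(s) - c(s))$ has the form:
for the logistic density,

$$C(s) - c(s) = \hat{S}(s-)\,\frac{Y(s)}{Y(s)+1}, \quad \text{where } \hat{S}(t) = \prod_{s \le t}\left\{1 - \frac{\Delta N(s)}{Y(s)+1}\right\};$$

for the Extreme Minimum Value density,

$$C(s) - c(s) = 1.$$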

If

$$\rho = C(-\infty) - c(-\infty)$$

then one could treat the nonresponders as responders, with response duration zero
(uncensored), and perform a classical two-sample test for right censored data. Specifically,

1.  In case the scores are generated from a logistic density and

$$\rho = \frac{Y(-\infty)}{Y(-\infty) + 1},$$

then the Peto & Peto (1972) test with nonresponders treated as responders is
equivalent to performing the test based on T in (3.3). Hence, for a location
shift alternative hypothesis with the underlying distribution of the data being
logistic, the Peto & Peto (1972) test with pooling will result in a LMPT only if
$\rho = 1$.

2.  For scores generated from the extreme minimum value density, if

$$\rho = 1 = C(-\infty) - c(-\infty)$$

then the logrank test with nonresponders treated as responders is the same test
as in (3.3). Of course, if $\rho \ne 1$ in reality then the logrank test with pooling will
result in a loss in efficiency for the test.

3.  As a special case, if all subjects responded then it is easy to see that the second
(binomial) component of the test statistic, $T_b$, will be zero and hence our combined
test based on T is Prentice's (1978) test, a classical two-sample test with
right censored data.

4.  If all failures occurred at one point, i.e., if all failures were tied at one failure
time, then our test as given in equation (3.3) is a binomial test.




Discussion 

1.   Unidirectional  Hypotheses 

The alternative hypothesis (2.2) on page 9 is an example of a unidirectional
hypothesis. Under hypothesis (2.2) the appropriate test is based on T as given in
equation (2.19). Here it is clear that $S_2(t) > S_1(t)$. Consequently, the estimated
cumulative hazard of group 1 is likely to be bigger than that of group 2. In
other words it is likely that


Moreover, since $p_2 > p_1$ the sign of T in equation (2.19) is likely to be positive.

2.  Bidirectional  Hypotheses 

Suppose now we are looking at a situation where the shifts in the rates and
failure times are in opposite directions. For example, let us consider a cohort
of Alzheimer's Disease patients. Let group I be the control group and group
II be the treatment group. Suppose our interest is in reducing the rate of
institutionalization in the treatment arm, but we expect the distribution of survival
times for those institutionalized in the treatment arm to be longer. In this case
the alternative hypothesis can be written as:


where all the parameters and constants in equation (3.5) have the same
definitions as in equation (2.2). It is imperative to note that the test for the
hypotheses (3.5) is no longer based on T; instead the appropriate test is now


dNijs)  _  dNjjs) 
Yiis)  ~  ¥2(3) 


:  < 


f2{x)  =  f{x  -  CQ!2A) 


(3.5) 




based  on 


■(—00,00] 


-p- 


L  Li  j 

(3.6) 


However, it is now to be noted that

$$-\frac{L_1L_2}{L}\left(\frac{k_2^*}{L_2} - \frac{k_1^*}{L_1}\right) = -\frac{Y_1(-\infty)Y_2(-\infty)}{Y(-\infty)}\left(\frac{dN_1(-\infty)}{Y_1(-\infty)} - \frac{dN_2(-\infty)}{Y_2(-\infty)}\right).$$

This fact does not in any way hinder our discourse on asymptotic normality,
as we shall see in section 3.3. Furthermore, it is clear that the asymptotic
variance for our statistics for either hypothesis (2.2) or (3.5) will be unchanged.

Note that in this case, when pooling, nonresponders must be treated as
responders at time $\infty$ instead of at time 0, in order for the pooled tests to be
comparable to the corresponding combined tests.

Issues  regarding  power  of  individual  testing  situations  will  be  discussed  in  Chapter  5 
through  Monte  Carlo  simulation  studies. 


3.3   Weak  Convergence  of  Tpm  under  Hn 

In this section asymptotic normality of the test based on the statistic T given in
equation (3.2) or (3.3) will be established. It should be noted that the first term in
(3.2) is based on the response durations for the $k^* (= k_1^* + k_2^*)$ responders. Fleming and
Harrington (1991), Corollary 7.2.1, page 260, give three conditions that are necessary
for weak convergence of the weighted logrank statistic. Since the statistic T has a
jump at $-\infty$, direct application of Corollary 7.2.1 of Fleming and Harrington (1991)
is justified through discussions in Helland (1982). Helland showed that any process
that has a finite number of jumps in $[-\infty, \log(t)]$ of sizes greater than $\epsilon^* > 0$ can
be handled by discretizing the process at the jump points. Discrete and continuous
parts can be addressed separately.

Definition 3.1 Set $\mathcal{I} = \{t : \pi_1(t)\pi_2(t) > 0\}$.

Definition 3.2 Set $u = \sup \mathcal{I}$, so $u \notin \mathcal{I}$ when $u = \infty$. Denote $\tau = \inf\{s : Y_1(s) \wedge Y_2(s) = 0\}$.

Consider the test based on the statistic given in equation (3.3). The normalized
structure of the statistic is

[-oo,t] 

In  proving  weak  convergence  of  the  statistic  given  in  equation  (3.7)  we  will  make  the 
following  assumption. 

Assumption 3.1 For i = 1, 2 there exists a constant $a_i \in (0, 1)$ such that, as $L \to \infty$,

$$\frac{L_i}{L} \to a_i.$$


The  following  Lemma  can  be  shown  by  arguments  in  the  proof  of  theorem  5.5.1,  page 
133,  Chung  (1974). 

Lemma 3.1 For the empirical estimator $Y_i(t)/L_i$ of $\pi_i(t)$,

$$\sup_{-\infty < t < \infty}\left|\frac{Y_i(t)}{L_i} - \pi_i(t)\right| \xrightarrow{P} 0.$$

The following theorem is a modified version of Corollary 7.2.1, Fleming and
Harrington (1991), page 260.

Theorem 3.1 Suppose $F_1 = F_2 = F$. Let

$$K(s) = \left(\frac{L}{L_1L_2}\right)^{1/2} W(s)\,\frac{Y_1(s)Y_2(s)}{Y(s)}. \tag{3.8}$$

Suppose that:
1.

$$\frac{K^2(s)}{Y_i(s)} \xrightarrow{P} h_i(s)$$

uniformly on $[-\infty, t]$ for any $t \in \mathcal{I}$ as $L \to \infty$, where $h_i$ is a nonnegative,
left-continuous function with right-hand limits such that $h_i(t) < \infty$ and $h_i^+$, the
right-continuous adaptation of $h_i$ on the interval $\mathcal{I}$, is of bounded variation on
each closed subinterval of $\mathcal{I}$, and $h_i(t) = 0$ for any $t \notin \mathcal{I}$.

2.  If $u \notin \mathcal{I}$, assume
(a)

$$\sigma^2(u) = \int_{[-\infty,u]} (h_1(s) + h_2(s))\,(1 - \Delta\Lambda(s))\,d\Lambda(s) < \infty$$

and 

(b)  for  any  e  >  0, 

limlimsupPj  /  K'{s)—^^^^dAis)  >  e]  =  0. 

[t,u] 




3.  In  addition,  if  u  <  oo, 


[u,oo] 

for  any  e  >  0. 

Let 


[-00,t] 

and 


-oo,t] 

Then,  as  L     oo,  for  any  r  €  [— oo,  oo\, 

{a?(r)}V2 


iV(0,l). 


The  estimators  in  (3.9)  and  (3.10)  are  quoted  from  equations  (4.1.21)  and  (4.1.20)  on 
page  58,  Gill  (1980).  These  estimators  are  shown  to  be  consistent  estimators  for 

J     ai7ri(tj  +  a27r2(rj 

[-oo,t] 

Note  that 

Conditions of theorem 3.1 will be verified to establish the weak convergence of T*
with scores generated from logistic and extreme minimum value densities. In both
cases, the weight function W(s) is discontinuous at $t = -\infty$. Hence in checking the
conditions in theorem 3.1 the cases $t > -\infty$ and $t = -\infty$ will be addressed separately.
From equation (3.8) it is clear that for logistic density


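$$K(t) = K_{PP}(t) = \begin{cases} \left(\dfrac{L_1L_2}{L}\right)^{1/2}\left(\dfrac{Y_1(t)}{L_1}\right)\left(\dfrac{Y_2(t)}{L_2}\right)\left(\dfrac{L}{Y(t)}\right)\hat{S}(t-)\,\dfrac{Y(t)}{Y(t)+1}, & t > -\infty \\[2ex] \left(\dfrac{L_1L_2}{L}\right)^{1/2}\rho, & t = -\infty. \end{cases} \tag{3.12}$$

When the scores are generated from extreme minimum value density,

$$K(t) = K_L(t) = \begin{cases} \left(\dfrac{L_1L_2}{L}\right)^{1/2}\left(\dfrac{Y_1(t)}{L_1}\right)\left(\dfrac{Y_2(t)}{L_2}\right)\left(\dfrac{L}{Y(t)}\right), & t > -\infty \\[2ex] \left(\dfrac{L_1L_2}{L}\right)^{1/2}\rho, & t = -\infty. \end{cases} \tag{3.13}$$

Weak convergence will be shown only for $T_{PP}$. The arguments for $T_L$ are similar to
those for $T_{PP}$ and shall be omitted.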




3.3.1    Weak Convergence of Peto & Peto (1972)'s Version of Equation (3.3).

Arguments verifying the conditions of theorem 3.1 will follow those given in
Fleming & Harrington (1991) pages 261-263.

For  t  >  — oo 


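For responders,

$$\frac{K_{PP}^2(t)}{Y_i(t)} = \frac{L}{L_1L_2}\left\{\frac{Y_1(t)Y_2(t)}{Y(t)}\right\}^2\left\{\hat{S}(t-)\,\frac{Y(t)}{Y(t)+1}\right\}^2\frac{1}{Y_i(t)}. \tag{3.14}$$

By assumption (3.1) and lemma (3.1), it is concluded that $K_{PP}^2(t)/Y_i(t) \xrightarrow{P} h_i(t)$ uniformly
on $(-\infty, t]$ for any $t \in \mathcal{I}$ as $L \to \infty$, where

$$h_i(t) = a_{(3-i)}\,\{S(t-)\}^2\,\frac{\pi_1^2(t)\,\pi_2^2(t)}{\pi_i(t)\,(a_1\pi_1(t) + a_2\pi_2(t))^2} < \infty.$$

Condition (1) of theorem (3.1) is satisfied.
Note,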

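$$\Delta\Lambda(t) = \frac{\Delta F(t)}{1 - F(t-)} = \frac{F(t) - F(t-)}{1 - F(t-)}.$$

Hence,

$$0 \le 1 - \Delta\Lambda(t) \le 1,$$

which implies that

$$\int_{(-\infty,u]} h_i(1 - \Delta\Lambda)\,d\Lambda \le \int_{(-\infty,u]} h_i\,d\Lambda \le \int_{(-\infty,u]} (h_1 + h_2)\,d\Lambda. \tag{3.15}$$

For condition (2a), using equation (3.15), we see

$$\begin{aligned} \int_{(-\infty,u]} h_i(1 - \Delta\Lambda)\,d\Lambda &\le \int_{(-\infty,u]} (h_1 + h_2)\,d\Lambda \\ &= \int_{(-\infty,u]} \{S(s-)\}^2\left(\frac{\pi_1\pi_2}{a_1\pi_1 + a_2\pi_2}\right)d\Lambda \\ &\le \frac{1}{a_2}\int_{(-\infty,u]} S(s-)\,d\Lambda \\ &\le \frac{1}{a_2} \\ &< \infty. \end{aligned} \tag{3.16}$$

Hence $\sigma^2(u) < \infty$, i.e., condition (2a) is satisfied.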

Condition  (2b)  follows  by  following  arguments  in  lemma  7.2.2  of  Fleming  and 
Harrington  (1991).  Note  that, 

^2  yit) 


Klp{t) 


L1L2/V  ^  'Y{t)  +  l)  V  Y{t) 
L  (Yi{t) 


Since  0(3_j)  >  0,  lemma  7.2.2  of  Fleming  &  Harrington  (1991)  and  assumption  (3.1) 
implies 

The  condition  (2b)  follows  immediately  if  AA(«)  —  0.  Since  u  ^  T,  we  can  assume 
'Ki{u)  =  0.  If  AA{u)  >  0,  the  condition  is  satisfied,  since  0(3_i)  >  0  and  TTi{u)  =  0  and 




hence 

L  (Yi 


I 


Condition  (3)  is  satisfied  since  lemma  7.2.2  of  Fleming  and  Harrington  (1991) 
implies  that, 

For  t  =  -oo. 


Since $Y_i(t)$ is the size of the risk set in group i at $t-$, $Y_i(-\infty) = L_i$, and $Y(-\infty) = L$.
The cumulative hazard function


Since 


clearly 


A{t)  =  I  {I-  F{s-)r'dFis) 


-00 


AA(t)  =  {1  -  F{t-)}-^AF{t) 


y,(-oo)       \  L        Li     \    L  j^- 


Under  assumption  (3.1),  as  L  — >  oo, 

Klp{-oo)  p 
Yi{-oo) 


hi{-oo), 




where $h_i(-\infty) = \rho^2 a_{(3-i)} < \infty$. Hence, $h_i$ is a non-negative constant (and hence
left-continuous, with right-hand limits), and $h_i < \infty$. Condition (1) of theorem (3.1)
is satisfied. Now,

$$\{h_1(-\infty) + h_2(-\infty)\}\,(1 - \Delta\Lambda(-\infty))\,\Delta\Lambda(-\infty) = p(1 - p)\rho^2 < \infty. \tag{3.17}$$

Hence, condition (2a) of theorem (3.1) is satisfied.

At $s = -\infty$, conditions (2b) and (3) of theorem 3.1 are vacuous and hence do not
need to be addressed.

Since  the  conditions  of  theorem  3.1  are  satisfied,  for  any  r  G  [—00,00], 


{cT2(r)}V2 
and  similarly 


^'^''^  ^viV(0,l). 


{a2(r)}V2 

The appropriate variances are $\sigma^2_{PP}(\tau)$ and $\sigma^2_L(\tau)$, respectively, defined below.

For the purpose of deriving asymptotic relative efficiency in Chapter 4, $\sigma^2(\tau)$ can
be expressed as

Ar)=    [    (     7il^'^^^  -  ^m)dm  +  pMI-p),  (3.18) 

J     \aini[t)  +  a2n2[t)  J 

{-oo,r] 

where 


k\t)  =  ]im  f-±-)K\t), 


uniformly  in  probability  on  each  closed  sub-interval  of  X. 
For  special  cases  Tpp  and  T£,  the  corresponding  A;^(i)'s  are 


klpit)  =  I 


{^^^  )}  (^I^SfSfe))'  ^>-°°  (3.19) 


t  =  —00 




and 

klit) 


(  ]    t  >  -oo 

p,  t  —  —00, 


respectively. 

According  to  the  structure  of  (t^{t)  in  (3.18),  using  (3.19)  and  (3.20),  the  asymp- 
totic variances  of  Tpp{T)  and  T^(r)  have  the  forms 


Gpp 


(-oo,r] 

and 


^iH=    /    (     7,^7'^'^  u,)  (l  -  AA(t))rfA(t)  +  p^pd  -  p),  (3.22) 


(-oo,t] 


vai7ri(i)  +  a2'n2(t) 


respectively. 

Estimators  for  cr^(r)  for  the  statistic  Tpp  and  Tl  are  given  below. 




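Based on equation (3.9) and (3.12), for Peto & Peto's version of the statistic,

$$\begin{aligned} \hat{\sigma}^2_{1PP}(t) &= \int_{(-\infty,t]} K_{PP}^2\,\frac{Y}{Y_1Y_2}\left(1 - \frac{\Delta N - 1}{Y - 1}\right)\frac{dN}{Y} + K_{PP}^2(-\infty)\,\frac{Y(-\infty)}{Y_1(-\infty)Y_2(-\infty)}\left(1 - \frac{\Delta N(-\infty) - 1}{Y(-\infty) - 1}\right)\frac{\Delta N(-\infty)}{Y(-\infty)} \\ &= \int_{(-\infty,t]} \hat{S}^2(s-)\,\frac{L}{L_1L_2}\,\frac{Y_1Y_2Y}{(Y+1)^2}\left(1 - \frac{\Delta N - 1}{Y - 1}\right)\frac{dN}{Y} + \rho^2\,\frac{k^*(L - k^*)}{L(L - 1)}, \end{aligned} \tag{3.23}$$

and, based on equation (3.10) and (3.12),

$$\begin{aligned} \hat{\sigma}^2_{2PP}(t) &= \sum_{i=1}^{2}\left[\int_{(-\infty,t]} K_{PP}^2\,\frac{1}{Y_i}\left(1 - \frac{\Delta N_i - 1}{Y_i - 1}\right)\frac{dN_i}{Y_i} + K_{PP}^2(-\infty)\,\frac{1}{Y_i(-\infty)}\left(1 - \frac{\Delta N_i(-\infty) - 1}{Y_i(-\infty) - 1}\right)\frac{\Delta N_i(-\infty)}{Y_i(-\infty)}\right] \\ &= \sum_{i=1}^{2}\int_{(-\infty,t]} \hat{S}^2(s-)\,\frac{L}{L_1L_2}\,\frac{Y_1^2Y_2^2}{(Y+1)^2}\,\frac{1}{Y_i}\left(1 - \frac{\Delta N_i - 1}{Y_i - 1}\right)\frac{dN_i}{Y_i} + \rho^2\sum_{i=1}^{2}\frac{L_{(3-i)}}{L}\,\frac{k_i^*(L_i - k_i^*)}{L_i(L_i - 1)}. \end{aligned} \tag{3.24}$$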




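Based on equation (3.9) and (3.13), for the logrank version of the statistic,

$$\begin{aligned} \hat{\sigma}^2_{1L}(t) &= \int_{(-\infty,t]} K_L^2\,\frac{Y}{Y_1Y_2}\left(1 - \frac{\Delta N - 1}{Y - 1}\right)\frac{dN}{Y} + K_L^2(-\infty)\,\frac{Y(-\infty)}{Y_1(-\infty)Y_2(-\infty)}\left(1 - \frac{\Delta N(-\infty) - 1}{Y(-\infty) - 1}\right)\frac{\Delta N(-\infty)}{Y(-\infty)} \\ &= \int_{(-\infty,t]} \frac{L}{L_1L_2}\,\frac{Y_1Y_2}{Y}\left(1 - \frac{\Delta N - 1}{Y - 1}\right)\frac{dN}{Y} + \rho^2\,\frac{k^*(L - k^*)}{L(L - 1)}, \end{aligned} \tag{3.25}$$

and, based on equation (3.10) and (3.13),

$$\begin{aligned} \hat{\sigma}^2_{2L}(t) &= \sum_{i=1}^{2}\left[\int_{(-\infty,t]} K_L^2\,\frac{1}{Y_i}\left(1 - \frac{\Delta N_i - 1}{Y_i - 1}\right)\frac{dN_i}{Y_i} + K_L^2(-\infty)\,\frac{1}{Y_i(-\infty)}\left(1 - \frac{\Delta N_i(-\infty) - 1}{Y_i(-\infty) - 1}\right)\frac{\Delta N_i(-\infty)}{Y_i(-\infty)}\right] \\ &= \sum_{i=1}^{2}\int_{(-\infty,t]} \frac{L}{L_1L_2}\,\frac{Y_1^2Y_2^2}{Y^2}\,\frac{1}{Y_i}\left(1 - \frac{\Delta N_i - 1}{Y_i - 1}\right)\frac{dN_i}{Y_i} + \rho^2\sum_{i=1}^{2}\frac{L_{(3-i)}}{L}\,\frac{k_i^*(L_i - k_i^*)}{L_i(L_i - 1)}. \end{aligned} \tag{3.26}$$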



In our computations during Monte Carlo simulations in Chapter 5 we will use
equations (3.23) and (3.25) as the variance estimates for the Peto & Peto and the
logrank versions of our test statistic T*.
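For the logrank version, the pooled variance estimate can be computed as a finite sum over the distinct uncensored times, with the atom at minus infinity handled by the same formula and weight rho. The sketch below assumes the weight K(s) = (L / (L1 L2))^(1/2) W(s) Y1(s) Y2(s) / Y(s), with W = 1 for s > -inf and W = rho at -inf; this is one reading of (3.8) and (3.13), not a transcription of the author's simulation code, and the function name and data are hypothetical.

```python
import math

def logrank_variance_estimate(x1, d1, x2, d2, rho):
    """Pooled variance estimate for the logrank version, summed over uncensored times.

    Assumes K(s) = sqrt(L / (L1 L2)) * W(s) * Y1(s) Y2(s) / Y(s),
    with W = 1 for s > -inf and W = rho at s = -inf.
    """
    l1, l2 = len(x1), len(x2)
    l = l1 + l2
    xs, ds = x1 + x2, d1 + d2
    failure_times = sorted({x for x, d in zip(xs, ds) if d == 1})
    total = 0.0
    for s in failure_times:
        y1 = sum(1 for x in x1 if x >= s)
        y2 = sum(1 for x in x2 if x >= s)
        if y1 == 0 or y2 == 0:
            continue  # K(s) = 0 once either risk set is empty
        y = y1 + y2
        dn = sum(1 for x, d in zip(xs, ds) if x == s and d == 1)
        w = rho if s == -math.inf else 1.0
        k_squared = (l / (l1 * l2)) * (w * y1 * y2 / y) ** 2
        total += k_squared * (y / (y1 * y2)) * (1 - (dn - 1) / (y - 1)) * dn / y
    return total

# Hypothetical data: L1 = L2 = 4, with k* = 5 responders overall.
x1, d1 = [-math.inf, -math.inf, 0.2, 0.9], [1, 1, 1, 0]
x2, d2 = [-math.inf, 0.5, 1.3, 1.7], [1, 1, 1, 1]
var_rho1 = logrank_variance_estimate(x1, d1, x2, d2, rho=1.0)
var_rho0 = logrank_variance_estimate(x1, d1, x2, d2, rho=0.0)
```

The difference between the two calls isolates the contribution at minus infinity, which for these data equals k*(L - k*) / (L(L - 1)) = 15/56, the binomial term of the closed form.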




CHAPTER  4 
ASYMPTOTICS  UNDER  Ha 


4.1  Introduction 

Weighted logrank tests are consistent against ordered hazards and stochastic
ordering alternatives, as shown in Fleming & Harrington (1991), pages 265-267. Such
tests have powers converging to unity as the sample size goes to $\infty$ for any fixed
alternative of these two types. As a result, a more refined measure of the asymptotic
operating characteristic must be used to discriminate among tests. The concept of
asymptotic relative efficiency plays a central role in the asymptotic theory of
hypothesis testing, and is discussed in the context of nonparametric tests by Randles & Wolfe
(1991). The underlying idea involves a contiguous sequence of alternative hypotheses
converging to the null hypothesis (Andersen et al. 1993; Fleming and Harrington
1991) at the "right" rate, as the sample size goes to $\infty$. Under such a sequence the
distribution of the statistic has, asymptotically, a finite mean and a positive variance.
The asymptotic distribution under the contiguous sequence of alternatives then
provides approximate operating characteristics of a test in large samples and for local
alternatives, i.e., alternatives "close" to the null hypothesis. For our purpose in this
chapter we shall consider a sequence of alternatives, in (2.2), depending on

=  71- 

This choice of a sequence of local alternatives has been used by authors in the
uncensored data case (Johnson et al. 1987).




4.2   The  Weighted  Logrank  Statistic 

For notational clarity, let $p_i^L$, $f_i^L$, $F_i^L$, $S_i^L$, $\lambda_i^L$, and $\Lambda_i^L$ be the proportion of
responders, density function, distribution function, survival function, hazard function
and cumulative hazard function, respectively, under the alternative hypothesis $H_a^L$
for group i = 1, 2. Let $K^L$ be the weight function in (3.8), which depends on L.
Also, as defined in chapter 3, let $\Lambda$ be the cumulative hazard function under the null
hypothesis. Of course, under the sequence of contiguous alternatives $\Lambda$ is the limiting
value of $\Lambda_i^L$ as $L \to \infty$. We have shown in Chapters 2 and 3 that a weighted logrank
test is locally most powerful if the weight function is constructed from the
underlying distribution of the data under $H_0$. Pitman's ARE is related to misspecification
of the weight function $K^L(t)$, including possible misspecification of $\rho$. For detailed
information on ARE, readers are referred to Randles & Wolfe (1991). In this chapter
the statistic T* will be written as $T(t, K^L)$, given by


'  -oo,t]  [-oo,t] 


Let

$$M_i^L(t) = N_i(t) - \int_{[-\infty,t]} Y_i(s)\,d\Lambda_i^L(s) \tag{4.3}$$

be the martingale processes for group i. Now using the compensators $\int_{[-\infty,t]} Y_i(s)\,d\Lambda_i^L(s)$
for the counting processes $N_i$, i = 1, 2, we see that


'  -oo,t]  [—oo,t] 


- 1  -^(^)(^-')-« 

[-00,t] 



[-oo,t] 

The  structure  of  T{t,  K^)  in  equation  (4.4)  will  be  useful  in  the  derivation  of  the 
efficacy  formulae  and  Pitman's  ARE. 




4.3   Asymptotics  Under  a  Sequence  of  Alternatives 

In chapter 3 of this manuscript, sufficient conditions for a limiting null distribution
were stated and verified for $T(\infty, K^L)$, as given in equation (4.2). Gill (1980) gave
sufficient conditions for asymptotic normality of the weighted logrank test under a
contiguous sequence of alternative hypotheses. Quoted by Fleming & Harrington
(1991) in theorem 7.4.1, page 269, these conditions are listed below in theorem 4.1.
We verify these conditions in section 4.4 for the test statistics $T(\infty, K^L)$ under the
sequence of alternative hypotheses described in section 4.1.

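Theorem 4.1 Consider the statistic $T(t, K^L)$ in equation (4.2). In addition to
assumption 3.1, suppose that for i = 1, 2, each of the following conditions is satisfied:

1.  For a distribution function, F, with respect to which each $F_i^L$ is absolutely
continuous,

$$\sup_{-\infty < t < \infty} |F_i^L(t) - F(t)| \to 0 \text{ as } L \to \infty. \tag{4.5}$$

There exists a real valued function $\gamma_i$ such that

$$\left(\frac{L_1L_2}{L}\right)^{1/2}\left(\frac{d\Lambda_i^L}{d\Lambda}(t) - 1\right) \to \gamma_i(t) \text{ as } L \to \infty \tag{4.6}$$

uniformly on each closed subinterval of $\{t : F(t-) < 1\}$ and

$$\int_{[-\infty,t]} |\gamma_i|\,d\Lambda < \infty \text{ for all } t \in \mathcal{I},\ i = 1, 2. \tag{4.7}$$

There exists a left-continuous function $\theta$, with right-hand limits $\theta^+$ of bounded
variation on closed subintervals of $\mathcal{I}$, and $\theta = 0$ outside $\mathcal{I}$, such that

$$\left(\frac{L}{L_1L_2}\right)^{1/2} K^L(t) \to \theta(t) \text{ as } L \to \infty \tag{4.8}$$

uniformly in probability on a closed subinterval of $\mathcal{I}$.
Then as $L \to \infty$ and $\forall t \in \mathcal{I}$,

$$T(t, K^L) \xrightarrow{D} N(\mu(t), \sigma^2(t)), \tag{4.9}$$

where

$$\mu(t) = \int_{[-\infty,t]} \theta(s)\gamma(s)\,d\Lambda(s), \tag{4.10}$$

with $\gamma(s) = \gamma_1(s) - \gamma_2(s)$, and

$$\sigma^2(t) = \int_{[-\infty,t]} \frac{a_1\pi_1(s) + a_2\pi_2(s)}{\pi_1(s)\pi_2(s)}\,\theta^2(s)\,(1 - \Delta\Lambda(s))\,d\Lambda(s). \tag{4.11}$$

If $\hat{\sigma}^2(t)$ is defined in equations (3.9) and (3.10), then

$$\hat{\sigma}^2(t) \xrightarrow{P} \sigma^2(t). \tag{4.12}$$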


2.  If  u  ^  X,  assuming 
a)  For  i  =  1, 2, 


limlimsup  sup 

tt"     L-400  se(t,u] 


dAti 


<  00; 


b)  For  hi{t)  defined  in  condition  (1)  on  page  24 


j  hi{l  -  AAj)dAi  <  00. 


(4.13) 


(4.14) 


c)  For  any  e  >  0  and  i  =  1,2 


limlimsupPl  /  LJ_(iAf  >  el  =  0 

[t,u] 


d) 


/  \97i\dA 


<  00; 


(4.15) 


e) 


then  as  L  ^  oo,  (4.9)  and  (4.12)  also  hold  for  t  =  u. 




limlimsupp|  r  \K^\\dAf  -  dA|  >  e}  =  0  for  any  e  >  0;  (4.16) 

L-^oo        l  Jt-  i 


3.  If  tt  <  oo,  assuming 
a)  For  i  =  1,2, 


limsup  sup 

Z/->oo  se{t,u] 


'tis) 


<  oo; 


(4.17) 


b)  As  L  — >•  oo. 


/ 

[u,oo] 


^  i 


(4.18) 


c) 


/     l/r^lldAf -dAI  AOasL-^oo; 

J[u,oo] 


(4.19) 


then,  as  L  ^  00,  equations  (4.9)  and  (4.12)  also  hold  for  any  t  e  [-00,00]. 




4.4   Verification  of  Gill's  (1980)  Sufficient  Conditions 


In this section we will verify the smoothness conditions stated in theorem 4.1 for
the statistic (4.2). The arguments that follow will be applied to the logistic and the
extreme minimum value densities. The logistic density and its corresponding hazard
function are

$$f(t) = e^{t}(1 + e^{t})^{-2}, \qquad -\infty < t < \infty, \tag{4.20}$$

and

$$\lambda(t) = \frac{e^{t}}{1 + e^{t}}, \tag{4.21}$$

respectively. The extreme minimum value density and its corresponding hazard
function are

$$f(t) = e^{(t - e^{t})}, \qquad -\infty < t < \infty, \tag{4.22}$$

and

$$\lambda(t) = e^{t}, \tag{4.23}$$

respectively. Note that both the logistic and the extreme minimum value densities are
absolutely continuous. We assume that the density function f in (2.1) has smooth first
derivatives and the corresponding hazard function $\lambda$ is positive on $\mathcal{I}$. The following
additional assumptions are made in subsequent arguments in this chapter.

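Assumption 4.1 There exists a constant Q, such that

$$\left|\frac{\lambda'(t)}{\lambda(t)}\right| < Q, \text{ uniformly for } t \text{ on } \mathcal{I}.$$

Assumption 4.2 There exist positive constants $Q_2$, $L' < u$ and $\delta$, such that, for
$t \in (L', u)$ and $\theta < \delta$,

$$\frac{\lambda(t + \theta)}{\lambda(t)} < Q_2.$$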




It  is  easy  to  see  that  the  assumptions  (4.1)  and  (4.2)  are  satisfied  for  the  logistic  and 
the  extreme  minimum  value  densities. 
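A quick numerical check makes the claim concrete. The sketch below takes Assumption 4.1 to be a bound on the ratio of the hazard derivative to the hazard (an interpretation, stated here as an assumption), and verifies on a finite grid that this ratio lies in (0, 1) for the logistic hazard and equals 1 for the extreme minimum value hazard. It also confirms that each hazard equals density divided by survival. All function names and the grid are hypothetical.

```python
import math

def logistic_density(t):
    return math.exp(t) / (1.0 + math.exp(t)) ** 2

def logistic_survival(t):
    return 1.0 / (1.0 + math.exp(t))

def logistic_hazard(t):
    return math.exp(t) / (1.0 + math.exp(t))

def logistic_hazard_derivative(t):
    return math.exp(t) / (1.0 + math.exp(t)) ** 2

def extreme_min_density(t):
    return math.exp(t - math.exp(t))

def extreme_min_survival(t):
    return math.exp(-math.exp(t))

def extreme_min_hazard(t):
    return math.exp(t)  # its derivative is also exp(t)

# Grid kept to [-5, 3] so that the extreme-value survival function does not underflow.
grid = [k / 10.0 for k in range(-50, 31)]
logistic_ratios = [logistic_hazard_derivative(t) / logistic_hazard(t) for t in grid]
hazard_checks = [
    (logistic_density(t) / logistic_survival(t), logistic_hazard(t),
     extreme_min_density(t) / extreme_min_survival(t), extreme_min_hazard(t))
    for t in grid
]
```

For the logistic case the ratio simplifies to 1 / (1 + e^t), which is why any Q of at least 1 works; for the extreme minimum value case the ratio is identically 1.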

In order to verify the conditions given in theorem 4.1 a mixture density is
introduced below.

The Mixture Density

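Based on the definition of $X_{ij}$ in chapter 3, the underlying density for the random
variable $X_{ij}$ is

$$g_i^L(x) = (1 - p_i^L)\,I_{\{x = -\infty\}} + p_i^L f_i^L(x)\,I_{\{x > -\infty\}}, \tag{4.24}$$

under $H_a^L: \Delta^L > 0$. Under $H_0: \Delta^L = 0$, the density function of $X_{ij}$ is

$$g(x) = (1 - p)\,I_{\{x = -\infty\}} + p f(x)\,I_{\{x > -\infty\}}. \tag{4.25}$$

Since g is a mixture density we can write the corresponding hazard function as a sum
of the discrete and the continuous parts. That is, we write

$$\lambda_i^L(x) = \lambda^L_{i(cont)}(x) + \lambda^L_{i(discrete)}(x), \qquad i = 1, 2.$$

Under $H_0$, we also express

$$\lambda(x) = \lambda_{(cont)}(x) + \lambda_{(discrete)}(x).$$

Clearly, the pieces for the above are:

$$\lambda^L_{i(discrete)}(-\infty) = 1 - p_i^L, \qquad \lambda_{(discrete)}(-\infty) = 1 - p,$$

and $\lambda^L_{i(discrete)}(x) = \lambda_{(discrete)}(x) = 0$ for $x > -\infty$. For i = 1, 2, the cumulative
hazards, under $H_a^L$ and $H_0$, can respectively be written as

$$\Lambda_i^L(x) = \Lambda^L_{i(cont)}(x) + (1 - p_i^L) \tag{4.26}$$

and

$$\Lambda(x) = \Lambda_{(cont)}(x) + (1 - p),$$

where $\Lambda^L_{i(cont)}(x) = \int_{(-\infty,x)} \lambda^L_{i(cont)}(t)\,dt$ and $\Lambda_{(cont)}(x)$ is similarly defined. Then it is
clear that for $x > -\infty$,

$$d\Lambda_i^L(x) = d\Lambda^L_{i(cont)}(x) = \lambda^L_{i(cont)}(x)\,dx, \tag{4.27}$$

$$d\Lambda(x) = d\Lambda_{(cont)}(x) = \lambda_{(cont)}(x)\,dx, \tag{4.28}$$

and

$$\Delta\Lambda_i^L(-\infty) = 1 - p_i^L, \qquad \Delta\Lambda(-\infty) = 1 - p. \tag{4.29}$$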
In the test statistic (4.2), the weight function $K^L(t)$ (with $K^L(t) = K^L_{PP}(t)$ for
Peto & Peto (1972), in (3.12), and $K^L(t) = K^L_L(t)$ for logrank, in (3.13)) involves
$\rho$. Usually, $\rho$ is an unknown parameter. In practice, one may misspecify $\rho$ in (4.2).
In order to evaluate the loss in efficiency by misspecifying $\rho$, we assume that a value
$\rho_1$, instead of the true value of $\rho$, has been used in $K^L_{PP}$ and $K^L_L$. The density
function from which scores are generated may be different from the underlying density
also. The verification of conditions in theorem 4.1 for the statistics $T(\infty, K^L_{PP})$ and
$T(\infty, K^L_L)$ will be done for the cases $t = -\infty$ and $t > -\infty$ separately.


1 




Case  1:  t  =  — oo 

For  Peto  &  Peto  and  logrank  tests,  ^(-oo)  takes  the  form: 

^(-oo) 

L  \i/2. 


Note  also  that 


/        J,       \  i/'! 

=    Hm  — -  K^{-oo) 

L-*oo  \L\L2J 


=  Pi 

=  ^pp(-oo)  =  ^i(-oo).  (4.30) 


7i(-oo) 

i->oo  \  L  J     I  AA(-oo)  J 

L^oo  \    L    J       [  1-p  J 

L  J     I    l-p(l-e-''^i^^)  J 

.         =  ,H^(^)-^'    Wf).  (4.31) 

■^l^cxjV  L  /  (1-pd^iA^) 

Taylor's  expansion  of  (1  —  e"**^'^^)  around  =  0  was  used  in  (4.31).  Using  equation 
(4.1),  and  assumption  3.1  and  substituting  Pi  =  L2/L  ,  we  have  from  equation  (4.31), 

7i(-oo) 

=  fe(T)'"(f)'"(T)(i-p<iV^) 

=  pda\'^4'\  (4.32) 


Similarly,  it  follows  that 
72(-oo) 


=   -pdal^\l^\  (4.33) 


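Hence,

$$\gamma(-\infty) = \gamma_1(-\infty) - \gamma_2(-\infty) = p\,d\,(a_1a_2)^{1/2}. \tag{4.34}$$

Therefore, using equations (4.29), (4.30) and (4.34) we have

$$\theta(-\infty)\,\gamma(-\infty)\,\Delta\Lambda(-\infty) = \rho_1\,d\,p(1 - p)\,(a_1a_2)^{1/2}. \tag{4.35}$$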

Case  2:  t  >  — oo 

For  T{oo,K^p)  ,  the  weight  function  takes  the  form  given  by  equation  (3.12). 
Here 

=  (i^)>-)v|St}(^) 

Hence  from  equation  (4.36),  assumption  3.1,  and  lemma  3.1,  we  have, 
epp{s) 

L  \i/2 


=  lim 

L->oo 


=  ii„.{^^(,_)_ri£LU^)f^V  M 


] 



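Similarly for $T(\infty, K^L_L)$ we have

$$\theta_L(s) = \frac{\pi_1(s)\pi_2(s)}{a_1\pi_1(s) + a_2\pi_2(s)}. \tag{4.38}$$

Since f(t) is absolutely continuous and $f_1^L(t) = f(t + c\alpha_1\Delta^L)$, we have

$$\frac{d\Lambda_1^L(t)}{d\Lambda(t)} = \frac{\lambda_1^L(t)}{\lambda(t)} = \frac{\lambda(t + c\alpha_1\Delta^L)}{\lambda(t)}. \tag{4.39}$$

From the Taylor's expansion in equation (4.39), around $\Delta^L = 0$, we see

$$\frac{\lambda(t + c\alpha_1\Delta^L)}{\lambda(t)} = \frac{\lambda(t) + c\alpha_1\Delta^L\lambda'(t) + \Delta^L o(1)}{\lambda(t)}. \tag{4.40}$$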

Using  assumption  3.1,  equations  (4.1)  and  (4.40), 

=  .™  {(^)-(..,v..(i)-'^M) 

=  cai(aia2)i/2^^].  (4.41) 


It can be shown that on each closed subinterval of $\{t : F(t-) < 1\}$, the convergence is
uniform. Following the arguments in equation (4.41), we see that


1 


t 

Using  equations  (4.41)  and  (4.42),  we  see  that 

7(0  =  7i(i)  -  72{t)  =  c(«i"2)^/'(^).  (4.43) 

Since  we  are  assuming  absolutely  continuous  distributions,  under  a  sequence  of 
contiguous  alternatives,  based  on  equation  (4.1),  condition  (4.5)  of  theorem  4.1  is 
satisfied. 

The existence of $\gamma_i$ is established by equations (4.32), (4.33), (4.41), and (4.42).
Further,

I  |7i|dA=|7i(-oo)|AA(-oo)+   J  \^,\dA 

[-<».«]  (-oo,t] 


{-oo,t] 


A'(5) 


X{s) 


dA{s).  (4.44) 


The finiteness of equation (4.44), for $t \in \mathcal{I}$, is clear under assumption 4.1. Condition
(4.7) is now satisfied.

Condition (4.8) has been established in equations (4.30), (4.37) and (4.38).

For conditions (4.13) and (4.17), note that for $s = -\infty$,


(4.45) 


dAf  _  (1  -  ) 
dA^ 

And  for  s  >  — oo, 

dAf  _  Af  _  A(^  +  caiA^) 
dA^  ~  ^~  X{t  -  ca2A^) ' 

For  equations  (4.45)  and  (4.46),  using  equation  (4.1)  and  assumption  4.2,  as  L  oo 

rfAf 


(4.46) 


dA 


L 


1  <  OO.  (4.47) 


By  equation  (4.47)  conditions  (4.13)  and  (4.17)  are  satisfied. 




k 

Condition (2b) on page 39 is the same as condition (2a) on page 24 and hence has
already been established.

For some of the remaining proofs the inequalities

and 

will  be  used. 

Conditions  (2c)  and  (3b)  will  be  shown  for  Peto  &  Peto  (1972)  scores.  Arguments 
involving  logrank  scores  are  simpler  than  those  involving  Peto  &  Peto  (1972)  scores 
and  hence  follow  immediately.  Note 


{ii'^p(-00)P  , 


-/(l^){^H^}^(^)^l(S^).A« 

(t,u] 


(4.50) 




If $t \in (L', u)$ then by equation (4.49) and assumption 4.2


[t,u]  ^ 


< 


{t,u] 


(4.51) 


{t,u] 

where  the  latter  follows  from  Lemma  7.2.2  of  Fleming  &  Harrington  (1991). 
Condition  (2c)  is  now  satisfied.  From  the  fact  that 


•pL-  f    ^    \  Q(    )    ^^^^     -  0 


almost  certainly  on  [u,  oo)  condition  (3b)  is  satisfied. 

Recall  that  pi  is  a  misspecified  value  of  true  p.  Arguments  for  condition  (2d)  will 
involve  both  p  and  pi.  For  condition  (2d)  note 

I  \9ppji\dA 
I 

=    ^pp(-oo)7i(-oo)|AA(-oo)+    I    \9pp{thi{t)\dA{t).  (4.52) 

(-oo,c») 

Using  equations  (4.29),  (4.30)  and  (4.32)  the  first  term  in  equation  (4.52)  is 

(pipdaj^^a2^^)(l  -p)  <  oo. 
Using  equation  (4.37)  and  (4.41)  the  second  term  in  equation  (4.52)  is 


cai(aia2)^/2    J  S{t-) 


(-c»,oo) 


{aiiri{t)  +  a2'K2{t)) 


X'{t) 


dA. 


(4.53) 




Under assumption 4.1 on page 41 and the arguments in (3.16), the finiteness of
equation (4.53) is clear. Thus condition (2d) is satisfied.

Assuming that at least one subject in each group responds, it suffices to verify
conditions (2e) and (3c) only for $t > -\infty$. Note that


/  \K'pp\ 

[t,u] 


dA 


-  1 


dA 


[t,u] 


- 1  {tr'>-^^mf)-^' 


X{s  +  caiA^) 


X{s) 

A'(e)  A(e) 


[t.«] 


m  A(s) 


dA 


dA. 


Using  assumptions  4.1  and  4.2,  this  is  bounded  above  by 


lt,u] 


which,  by  equation  4.48  is,  in  turn  bounded  above  by 


[t,u] 

[t,K] 


0    (since  Ci  €  (0,1)). 


(4.54) 


By  virtue  of  equation  (4.54)  condition  (2e)  is  satisfied.  Again  using  lemma  7.2.2,  on 
page  261  of  Fleming  &  Harrington  (1991), 


lim 

L->oo 


j  7ridA  =  0 


[tj,oo] 


and  condition  (3c)  is  satisfied. 




Now  all  the  conditions  of  theorem  4.1  are  satisfied.  The  structure  of  the  general 
formula  of  efficacy  will  be  derived  in  section  4.5. 




4.5   Derivation  of  Efficacy 

Our goal in this section is to evaluate Pitman's asymptotic relative efficiency
(ARE) when the value of the parameter $\rho$ is misspecified. According to Noether's
theorem (see Randles & Wolfe 1991), the ARE between two statistics, V and W,
denoted by ARE(V, W), is defined in terms of the respective efficacies of the statistics
V and W. The efficacy of the statistic

a{oo) 


is  defined  as: 


e{9,  oo)  = 


a{oo) 

(  J  9{sHs)dA{s)\ 
\r-oo,oo]  / 


[-00,00] 


(4.55) 


where μ(∞) and σ(∞) are defined in equations (4.10) and (4.11), respectively. Using
the definition of efficacy in (4.55), specific formulae for the efficacy of
T(∞, K_PP, ρ₁) and T(∞, K_L, ρ₁) will be derived. Note that here we are interested
in the efficacy when ρ₁ is a misspecified value of ρ.

4.5.1    Efficacy  Formulae 

From  Gill  (1980)  we  know  that  the  optimum  choice  of  the  weight  function  K{t) 
is  such  that 




In this section we give the formula for efficacy for the optimal choice of the
weight function, based on (4.55). However, we will assume that ρ₁ is a misspecified
value of ρ.

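Efficacy for Peto & Peto (1972) version of combined test

From (4.10) and (4.11), for T(∞, K_PP, ρ₁),

\[
\mu_{PP}(\infty) = \int_{[-\infty,\infty]} \theta_{PP}(t)\,\gamma(t)\,d\Lambda(t)
= c(a_1 a_2)^{1/2} \left\{ \int_{(-\infty,\infty]} S(t) \left( \frac{\pi_1(t)\pi_2(t)}{a_1\pi_1(t) + a_2\pi_2(t)} \right) \lambda'(t)\,dt + \rho\rho_1 p(1-p) \right\} \tag{4.57}
\]

and

\[
\sigma^2_{PP}(\infty) = \int_{(-\infty,\infty]} S^2(t) \left( \frac{\pi_1(t)\pi_2(t)}{a_1\pi_1(t) + a_2\pi_2(t)} \right) \lambda(t)\,dt + \rho_1^2 p(1-p). \tag{4.58}
\]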


If the underlying distribution is logistic then

\[
S(t) = \frac{1}{1 + e^t}, \qquad \lambda(t) = \frac{e^t}{1 + e^t}
\]

and

\[
\lambda'(t) = \frac{e^t}{(1 + e^t)^2}.
\]


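Hence from (4.57) and (4.58)

\[
\mu_{PP}(\infty) = c(a_1 a_2)^{1/2} \left\{ \int_{(-\infty,\infty]} \frac{1}{1 + e^t} \left( \frac{\pi_1(t)\pi_2(t)}{a_1\pi_1(t) + a_2\pi_2(t)} \right) \frac{e^t}{(1 + e^t)^2}\,dt + \rho\rho_1 p(1-p) \right\} \tag{4.59}
\]

and

\[
\sigma^2_{PP}(\infty) = \int_{(-\infty,\infty]} \frac{1}{(1 + e^t)^2} \left( \frac{\pi_1(t)\pi_2(t)}{a_1\pi_1(t) + a_2\pi_2(t)} \right) \frac{e^t}{1 + e^t}\,dt + \rho_1^2 p(1-p). \tag{4.60}
\]

Using (4.55), (4.59), and (4.60) the efficacy for a misspecification of ρ is

\[
e(\theta_{PP}, \rho, \rho_1, \infty) = c^2 a_1 a_2 \,
\frac{\left[ \int_{(-\infty,\infty]} \frac{e^t}{(1 + e^t)^3} \left( \frac{\pi_1(t)\pi_2(t)}{a_1\pi_1(t) + a_2\pi_2(t)} \right) dt + \rho\rho_1 p(1-p) \right]^2}
{\int_{(-\infty,\infty]} \frac{e^t}{(1 + e^t)^3} \left( \frac{\pi_1(t)\pi_2(t)}{a_1\pi_1(t) + a_2\pi_2(t)} \right) dt + \rho_1^2 p(1-p)}. \tag{4.61}
\]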


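Efficacy for logrank version of combined test

Again using (4.10) and (4.11), for T(∞, K_L, ρ₁),

\[
\mu_{L}(\infty) = \int_{[-\infty,\infty]} \theta_{L}(t)\,\gamma(t)\,d\Lambda(t)
= c(a_1 a_2)^{1/2} \left\{ \int_{(-\infty,\infty]} \left( \frac{\pi_1(t)\pi_2(t)}{a_1\pi_1(t) + a_2\pi_2(t)} \right) \lambda'(t)\,dt + \rho\rho_1 p(1-p) \right\} \tag{4.62}
\]

and

\[
\sigma^2_{L}(\infty) = \int_{(-\infty,\infty]} \left( \frac{\pi_1(t)\pi_2(t)}{a_1\pi_1(t) + a_2\pi_2(t)} \right) \lambda(t)\,dt + \rho_1^2 p(1-p). \tag{4.63}
\]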


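If the underlying distribution is extreme minimum value, then

\[
\lambda(t) = e^t
\]

and

\[
\lambda'(t) = e^t.
\]

Hence, from (4.62) and (4.63)

\[
\mu_{L}(\infty) = c(a_1 a_2)^{1/2} \left\{ \int_{(-\infty,\infty]} \left( \frac{\pi_1(t)\pi_2(t)}{a_1\pi_1(t) + a_2\pi_2(t)} \right) e^t\,dt + \rho\rho_1 p(1-p) \right\} \tag{4.64}
\]

and

\[
\sigma^2_{L}(\infty) = \int_{(-\infty,\infty]} \left( \frac{\pi_1(t)\pi_2(t)}{a_1\pi_1(t) + a_2\pi_2(t)} \right) e^t\,dt + \rho_1^2 p(1-p). \tag{4.65}
\]

Using (4.55), (4.64), and (4.65) the efficacy for a misspecification of ρ is

\[
e(\theta_{L}, \rho, \rho_1, \infty) = c^2 a_1 a_2 \,
\frac{\left[ \int_{(-\infty,\infty]} \left( \frac{\pi_1(t)\pi_2(t)}{a_1\pi_1(t) + a_2\pi_2(t)} \right) e^t\,dt + \rho\rho_1 p(1-p) \right]^2}
{\int_{(-\infty,\infty]} \left( \frac{\pi_1(t)\pi_2(t)}{a_1\pi_1(t) + a_2\pi_2(t)} \right) e^t\,dt + \rho_1^2 p(1-p)}. \tag{4.66}
\]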


Note:

1.  The efficacies given above are for the cases where the optimal weight function
was used for each underlying density, yet ρ₁ ≠ ρ. This will enable us to evaluate
the loss in efficiency when ρ is misspecified.

2.  The Asymptotic Relative Efficiency (ARE) of one statistic T with respect to
another W is defined as the ratio of their respective efficacies. In other words,

\[
ARE(T, W) = \frac{e(T, \infty)}{e(W, \infty)}.
\]

ARE(T, W) = 1.2 implies that, in order to achieve the same power, W requires
a sample size which is 20% larger than that required by T.

3.  The constants c², a₁, and a₂ in the formulae (4.61) and (4.66) do not play any
role in the computation of the corresponding AREs.

4.  π₁(s) and π₂(s) are defined in chapter 3.
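
The sample-size reading of the ARE in note 2 can be illustrated with a short sketch. The function name and the example sample size of 300 are illustrative assumptions, and the relation used is the usual asymptotic (Pitman) interpretation rather than an exact finite-sample statement.

```python
import math


def required_sample_size(n_t: int, are_t_w: float) -> int:
    """Sample size W needs to match the power T attains with n_t subjects.

    Uses the asymptotic interpretation ARE(T, W) = e(T) / e(W) = n_W / n_T.
    """
    if n_t <= 0 or are_t_w <= 0:
        raise ValueError("n_t and are_t_w must be positive")
    # The small tolerance guards against floating-point noise before rounding up.
    return math.ceil(n_t * are_t_w - 1e-9)


# With ARE(T, W) = 1.2, W needs 20% more subjects than T.
print(required_sample_size(300, 1.2))
```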




4.6   Computed  AREs  and  Efficacies 

In this section efficacies for various settings of the parameters are computed.
Computations are performed using the formula given in equation (4.61) for Peto &
Peto's version of our test statistic and in equation (4.66) for the logrank version
of our test statistic.

Parameter  Settings 

1.  ρ = (0.05, 0.2, 1, 3, 10).

Note that ρ is the true value of the parameter, and ρ₁ = (0.05, 0.2, 1, 3, 10) are
misspecified values of ρ.

2.  Densities considered are:
extreme minimum value and logistic.

3.  Censoring parameter:

α₁ = (0, 2, 4), for group 1.
α₂ = (0, 2, 4), for group 2.

Note that αᵢ = 0, αᵢ = 2, and αᵢ = 4 imply no censoring, light censoring,
and heavy censoring, respectively. The censoring distribution used to compute
efficacies was extreme minimum value

L{x)  =  1  -  e(-"'f(V")-'^-'-<^/"»).    -00  <  X  <  oo,    a  >  0. 

4.  Proportion of responders:

p = (0.2, 0.5, 0.8). Hence we are looking at low to high response rates.


Integrations in the formulae were performed numerically using the Midpoint-Simpson
rule as cited in Thisted (1988). The algorithm for the Midpoint-Simpson rule, given
on page 275 in Thisted (1988), was implemented in our computations. The main
reason for selecting Simpson's rule was the relative simplicity of our integrals.
For more complicated integrals, one could use other numerical integration techniques,
namely the quadrature formulae. Amongst the various Simpson's rules, the midpoint rule
enables us to evaluate integrals with singularities at the end points of the interval
of integration. This is especially important since the limits of integration in
formulae (4.61) and (4.66) are (−∞, ∞]. Efficacy plots revealed that efficacies
approached an asymptote soon beyond log(t) = log(4). Hence all integrals were
evaluated over (−∞, log(4)].
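
The idea of an open (endpoint-free) rule on a truncated range can be sketched as follows. This is a plain composite midpoint rule, not the Midpoint-Simpson algorithm from Thisted (1988); the lower cut-off of −40 standing in for −∞, the number of subintervals, and the choice of the logistic λ'(t) as a test integrand (which has a closed-form antiderivative) are all assumptions made for illustration.

```python
import math


def midpoint_rule(f, a: float, b: float, n: int) -> float:
    """Composite midpoint rule: f is evaluated only at interior midpoints."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def logistic_hazard_derivative(t: float) -> float:
    """lambda'(t) = e^t / (1 + e^t)^2 for the standard logistic distribution."""
    return math.exp(t) / (1.0 + math.exp(t)) ** 2


LOWER = -40.0           # assumed stand-in for -infinity
UPPER = math.log(4.0)   # truncation point used in the text

approx = midpoint_rule(logistic_hazard_derivative, LOWER, UPPER, 200_000)

# The antiderivative of e^t / (1 + e^t)^2 is -1 / (1 + e^t).
exact = 1.0 / (1.0 + math.exp(LOWER)) - 1.0 / (1.0 + math.exp(UPPER))

print(approx, exact)
```

Because the rule never evaluates the integrand at either end point, the same function could be reused for integrands that are undefined or singular there.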

Tables 4.1, 4.3, and 4.5 give computed efficacies when the underlying density
is logistic, with weights (for the linear rank test) also generated from the logistic
density. Tables 4.7, 4.9, and 4.11, on the other hand, give computed efficacies when
the underlying density is extreme minimum value, with weights generated from the
extreme minimum value density. In other words, these tables give efficacies for the
locally most powerful test when the constant parameter ρ is misspecified. Tables 4.2,
4.4, 4.6, 4.8, 4.10, and 4.12 give the corresponding Pitman's ARE values.

Description  of  Tables  4.1-4.12 

Each table is split into three boxes stacked vertically, for p = 0.2, p = 0.5, and
p = 0.8. The second column gives the true values of ρ. The top row shows the values
ρ₁ that ρ could take if it were misspecified. The entries in the body of the tables
are the computed efficacies or AREs, as the case may be. Note that in all these cases
we are looking at efficacies when censoring in both groups is equal. Gill (1980)
pointed out that this is the case when efficacy is maximum.

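Each entry in tables 4.2, 4.4, 4.6, 4.8, 4.10, and 4.12 is computed using the
formula

\[
ARE = \frac{e(\theta, \rho, \rho_1, \infty)}{e(\theta, \rho, \rho, \infty)},
\]

that is, the efficacy under the misspecified value ρ₁ divided by the efficacy when
ρ₁ = ρ, using corresponding entries from tables 4.1, 4.3, 4.5, 4.7, 4.9, and 4.11,
respectively.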
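
As a worked illustration of this ratio, the sketch below divides each row of the p = 0.2 box of Table 4.1 by its diagonal entry. The variable and function names are hypothetical. Because the tabulated efficacies are rounded to two decimals, the results agree with the p = 0.2 box of Table 4.2 only to within about 0.01.

```python
RHO = [0.05, 0.2, 1, 3, 10]

# Efficacies from Table 4.1, box p = 0.2 (rows: true rho; columns: rho_1).
EFFICACY_P02 = [
    [0.06, 0.06, 0.02, 0.00, 0.00],
    [0.06, 0.07, 0.04, 0.02, 0.01],
    [0.08, 0.13, 0.22, 0.19, 0.17],
    [0.12, 0.37, 1.33, 1.50, 1.47],
    [0.32, 2.18, 12.54, 15.75, 16.06],
]


def are_table(efficacy):
    """Divide each row by its diagonal entry (the correctly specified case)."""
    return [[e / row[i] for e in row] for i, row in enumerate(efficacy)]


ARE_P02 = are_table(EFFICACY_P02)
for rho, row in zip(RHO, ARE_P02):
    print(f"rho = {rho:<5}", "  ".join(f"{v:.2f}" for v in row))
```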

Tables 4.2, 4.4, and 4.6 give the AREs for misspecification of ρ when the underlying
density is logistic and the scores are generated from the logistic density, for
censoring parameter α = 0, α = 2, and α = 4, respectively.

Tables 4.8, 4.10, and 4.12 give the AREs for misspecification of ρ when the
underlying density is extreme minimum value and the scores are generated from the
extreme minimum value density, for censoring parameter α = 0, α = 2, and α = 4,
respectively.

Discussion For Misspecification of ρ
Efficacies

Consider tables 4.1, 4.3, 4.5, 4.7, 4.9, and 4.11.

1.  The efficacies behave differently for ρ < 1 and ρ > 1. When ρ < 1, for a fixed
ρ (say ρ = 0.05), the efficacies increase as p goes from 0.2 to 0.8. On the other
hand, when ρ > 1 the efficacies increase in the interval (0 < p < 0.5) and then
decrease in the interval (0.5 < p < 1). This feature can be observed by glancing
down each column (fixed ρ₁) of each table for a fixed value of ρ.
One explanation for this is that when ρ < 1 (⟹ d < c) our test statistic, given in
equation (2.19), is dominated by the shift in the survival times (responders).
Hence, when p increases, i.e., the proportion of responders increases, the efficacy
of T(∞, K, ρ₁) increases almost as closely as Prentice's (1978) T_P statistic.
When ρ > 1 (⟹ d > c), it is clear that T(∞, K, ρ₁), given in equation (2.19),
is dominated by the binomial component B, which is a test for a difference
of two proportions. Hence it is expected that in such an event T(∞, K, ρ₁)
will behave like a binomial test. Consequently such a test will have the highest
efficacy when p = 0.5.

2.  For each value of p and α the efficacy is highest when ρ₁ = ρ. These values
are the entries along the main diagonal in each table, for fixed p.

3.  Scanning across tables 4.1, 4.3, and 4.5 it is evident that as the degree of
censoring increases from none (α = 0) to heavy (α = 4) the efficacies drop.
This decline in efficacy as the rate of censoring increases is more pronounced
when ρ < 1 than when ρ > 1.
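
The ρ > 1 behaviour described in item 1 can be checked roughly against the tabulated values. The sketch below assumes that, for large ρ, the binomial terms ρρ₁p(1 − p) and ρ₁²p(1 − p) in (4.61) and (4.66) dominate the integrals, so that with ρ₁ = ρ the efficacy is approximately ρ²p(1 − p), with the constant c²a₁a₂ dropped. That approximation and the function name are assumptions made for illustration, not part of the derivation.

```python
def binomial_dominated_efficacy(rho: float, p: float) -> float:
    """Approximate efficacy rho^2 * p * (1 - p) when the binomial part dominates."""
    return rho ** 2 * p * (1.0 - p)


# Diagonal entries (rho_1 = rho = 10) from Table 4.1 for p = 0.2, 0.5, 0.8.
TABLE_4_1_DIAGONAL = {0.2: 16.06, 0.5: 25.15, 0.8: 16.24}

for p, tabulated in TABLE_4_1_DIAGONAL.items():
    approx = binomial_dominated_efficacy(10, p)
    print(f"p = {p}: approx = {approx:.2f}, tabulated = {tabulated:.2f}")
```

Under this approximation the efficacy is proportional to p(1 − p), which peaks at p = 0.5, in line with the rise and fall seen in the tables for ρ > 1.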

AREs

Consider tables 4.2, 4.4, 4.6, 4.8, 4.10, and 4.12. These tables show the loss in ARE
for a misspecification of ρ.

1.  For each fixed censoring rate and proportion of responders, as ρ₁
moves farther away from the true value of ρ, there is likely to be a loss in
Pitman's ARE.

2.  The loss in efficiency is more sensitive for ρ₁ < 1 than for ρ₁ > 1.

3.  Of importance is the case when ρ₁ = 1. This is when the shift for the failure time
distribution and that for the log-odds are equal. This is also, as a result, the
case when one treats the nonresponders as responders and performs a classical
two-sample test. The column for ρ₁ = 1 in all the ARE tables shows the
loss in efficiency when ρ₁ is misspecified.

Logistic

For each degree of censoring, the AREs increase as the rate of responders
increases from 0.2 to 0.8, when ρ < 1. However, when ρ > 1 the AREs decrease
as the rate of responders increases from p = 0.2 to p = 0.8. This is possibly
because when ρ > 1 our test based on T is dominated by the responders. When the
rate of censoring increases, for ρ < 1, the AREs decrease. However, when ρ > 1,
the ARE actually increases slightly with an increase in the degree of censoring.
This is expected, since censoring only affects the failure time distribution and
not the rate of responders.

Extreme Minimum Value

The above observations in relation to the logistic distribution are exactly the
same as those seen for the extreme minimum value distribution.


Table 4.1. EFFICACIES for α₁ = 0 & α₂ = 0, (θ, γ) = (L, L).

Censoring : α₁ = 0 & α₂ = 0.
Score Generating Distribution = Logistic.
Underlying Distribution = Logistic.

          ρ        ρ₁=0.05    ρ₁=0.2      ρ₁=1      ρ₁=3     ρ₁=10
p = 0.2   0.05        0.06      0.06      0.02      0.00      0.00
          0.2         0.06      0.07      0.04      0.02      0.01
          1           0.08      0.13      0.22      0.19      0.17
          3           0.12      0.37      1.33      1.50      1.47
          10          0.32      2.18     12.54     15.75     16.06

p = 0.5   0.05        0.15      0.14      0.07      0.01      0.00
          0.2         0.15      0.16      0.10      0.04      0.02
          1           0.17      0.25      0.40      0.34      0.28
          3           0.23      0.56      2.03      2.40      2.33
          10          0.50      2.65     17.59     24.39     25.15

p = 0.8   0.05        0.24      0.24      0.15      0.04      0.01
          0.2         0.24      0.24      0.18      0.07      0.02
          1           0.25      0.30      0.40      0.31      0.21
          3           0.29      0.46      1.30      1.68      1.56
          10          0.42      1.27      8.48     15.13     16.24

Table 4.2. AREs for α₁ = 0 & α₂ = 0, (θ, γ) = (L, L).

Censoring : α₁ = 0 & α₂ = 0.
Score Generating Distribution = Logistic.
Underlying Distribution = Logistic.

          ρ        ρ₁=0.05    ρ₁=0.2      ρ₁=1      ρ₁=3     ρ₁=10
p = 0.2   0.05           1      1.00      0.33      0.00      0.00
          0.2         0.85         1      0.57      0.28      0.14
          1           0.36      0.59         1      0.86      0.77
          3           0.08      0.25      0.89         1      0.98
          10          0.02      0.13      0.78      0.98         1

p = 0.5   0.05           1      0.93      0.47      0.06      0.00
          0.2         0.93         1      0.62      0.25      0.12
          1           0.42      0.62         1      0.85      0.70
          3           0.09      0.23      0.84         1      0.97
          10          0.02      0.10      0.70      0.97         1

p = 0.8   0.05           1      1.00      0.62      0.17      0.04
          0.2         1.00         1      0.75      0.29      0.08
          1           0.62      0.75         1      0.77      0.52
          3           0.17      0.27      0.77         1      0.93
          10          0.02      0.08      0.52      0.93         1


Table 4.3. EFFICACIES for α₁ = 2 & α₂ = 2, (θ, γ) = (L, L).

Censoring : α₁ = 2 & α₂ = 2.
Score Generating Distribution = Logistic.
Underlying Distribution = Logistic.

          ρ        ρ₁=0.05    ρ₁=0.2      ρ₁=1      ρ₁=3     ρ₁=10
p = 0.2   0.05        0.04      0.04      0.01      0.00      0.00
          0.2         0.05      0.05      0.03      0.01      0.01
          1           0.06      0.12      0.20      0.19      0.17
          3           0.10      0.39      1.35      1.48      1.46
          10          0.35      2.61     13.22     15.81     16.05

p = 0.5   0.05        0.11      0.11      0.04      0.01      0.00
          0.2         0.12      0.12      0.07      0.03      0.01
          1           0.14      0.21      0.36      0.31      0.27
          3           0.20      0.56      2.05      2.36      2.31
          10          0.50      3.08     18.87     24.53     25.11

p = 0.8   0.05        0.18      0.18      0.10      0.03      0.00
          0.2         0.18      0.18      0.13      0.05      0.02
          1           0.19      0.24      0.34      0.27      0.20
          3           0.23      0.41      1.28      1.62      1.53
          10          0.37      1.34      9.35     15.32     16.18

Table 4.4. AREs for α₁ = 2 & α₂ = 2, (θ, γ) = (L, L).

Censoring : α₁ = 2 & α₂ = 2.
Score Generating Distribution = Logistic.
Underlying Distribution = Logistic.

          ρ        ρ₁=0.05    ρ₁=0.2      ρ₁=1      ρ₁=3     ρ₁=10
p = 0.2   0.05           1      1.00      0.25      0.00      0.00
          0.2         1.00         1      0.60      0.20      0.20
          1           0.30      0.60         1      0.95      0.85
          3           0.07      0.26      0.93         1      0.99
          10          0.02      0.16      0.82      0.98         1

p = 0.5   0.05           1      1.00      0.36      0.09      0.00
          0.2         1.00         1      0.58      0.25      0.08
          1           0.39      0.58         1      0.86      0.75
          3           0.08      0.24      0.87         1      0.98
          10          0.02      0.12      0.75      0.98         1

p = 0.8   0.05           1      1.00      0.55      0.17      0.00
          0.2         1.00         1      0.72      0.28      0.11
          1           0.56      0.70         1      0.79      0.59
          3           0.14      0.25      0.79         1      0.94
          10          0.02      0.08      0.58      0.95         1


Table 4.5. EFFICACIES for α₁ = 4 & α₂ = 4, (θ, γ) = (L, L).

Censoring : α₁ = 4 & α₂ = 4.
Score Generating Distribution = Logistic.
Underlying Distribution = Logistic.

          ρ        ρ₁=0.05    ρ₁=0.2      ρ₁=1      ρ₁=3     ρ₁=10
p = 0.2   0.05        0.04      0.03      0.01      0.00      0.00
          0.2         0.04      0.04      0.02      0.01      0.01
          1           0.05      0.11      0.20      0.18      0.17
          3           0.10      0.41      1.36      1.48      1.47
          10          0.37      3.00     13.66     15.85     16.07

p = 0.5   0.05        0.09      0.09      0.03      0.01      0.00
          0.2         0.10      0.10      0.06      0.02      0.01
          1           0.12      0.20      0.34      0.30      0.27
          3           0.18      0.58      2.08      2.34      2.30
          10          0.51      3.49     19.74     24.62     25.09

p = 0.8   0.05        0.14      0.14      0.08      0.02      0.00
          0.2         0.15      0.15      0.10      0.04      0.01
          1           0.16      0.21      0.30      0.25      0.19
          3           0.19      0.38      1.28      1.58      1.51
          10          0.35      1.43     10.02     15.43     16.14


Table 4.6. AREs for α₁ = 4 & α₂ = 4, (θ, γ) = (L, L).

Censoring : α₁ = 4 & α₂ = 4.
Score Generating Distribution = Logistic.
Underlying Distribution = Logistic.

          ρ        ρ₁=0.05    ρ₁=0.2      ρ₁=1      ρ₁=3     ρ₁=10
p = 0.2   0.05           1      0.75      0.25      0.00      0.00
          0.2         1.00         1      0.50      0.25      0.25
          1           0.34      0.55         1      0.90      0.85
          3           0.08      0.28      0.92         1      0.99
          10          0.02      0.19      0.85      0.99         1

p = 0.5   0.05           1      1.00      0.33      0.11      0.00
          0.2         1.00         1      0.60      0.20      0.10
          1           0.35      0.59         1      0.88      0.79
          3           0.08      0.25      0.89         1      0.98
          10          0.02      0.14      0.79      0.98         1

p = 0.8   0.05           1      1.00      0.57      0.14      0.00
          0.2         1.00         1      0.67      0.27      0.06
          1           0.53      0.70         1      0.83      0.63
          3           0.12      0.24      0.81         1      0.96
          10          0.02      0.09      0.62      0.96         1

Table 4.7. EFFICACIES for α₁ = 0 & α₂ = 0, (θ, γ) = (E, E).

Censoring : α₁ = 0 & α₂ = 0.
Score Generating Distribution = EMV.
Underlying Distribution = EMV.

          ρ        ρ₁=0.05    ρ₁=0.2      ρ₁=1      ρ₁=3     ρ₁=10
p = 0.2   0.05        0.20      0.19      0.12      0.03      0.00
          0.2         0.20      0.20      0.15      0.05      0.02
          1           0.21      0.26      0.36      0.28      0.20
          3           0.25      0.42      1.28      1.64      1.54
          10          0.39      1.32      9.06     15.26     16.20

p = 0.5   0.05        0.49      0.49      0.34      0.10      0.01
          0.2         0.50      0.50      0.39      0.15      0.04
          1           0.52      0.58      0.74      0.56      0.35
          3           0.57      0.82      2.08      2.74      2.51
          10          0.77      1.96     12.07     23.30     25.49

p = 0.8   0.05        0.79      0.78      0.67      0.29      0.04
          0.2         0.79      0.79      0.71      0.35      0.07
          1           0.80      0.84      0.95      0.72      0.34
          3           0.83      0.98      1.69      2.23      1.86
          10          0.95      1.54      6.02     14.02     16.79

Table 4.8. AREs for α₁ = 0 & α₂ = 0, (θ, γ) = (E, E).

Censoring : α₁ = 0 & α₂ = 0.
Score Generating Distribution = EMV.
Underlying Distribution = EMV.

          ρ        ρ₁=0.05    ρ₁=0.2      ρ₁=1      ρ₁=3     ρ₁=10
p = 0.2   0.05           1      0.95      0.60      0.15      0.00
          0.2         1.00         1      0.75      0.25      0.10
          1           0.58      0.72         1      0.78      0.56
          3           0.15      0.26      0.78         1      0.93
          10          0.02      0.08      0.56      0.94         1

p = 0.5   0.05           1      1.00      0.69      0.20      0.02
          0.2         1.00         1      0.78      0.30      0.08
          1           0.70      0.78         1      0.76      0.47
          3           0.21      0.30      0.76         1      0.92
          10          0.03      0.08      0.47      0.91         1

p = 0.8   0.05           1      0.99      0.85      0.37      0.05
          0.2         1.00         1      0.90      0.44      0.08
          1           0.84      0.88         1      0.76      0.36
          3           0.37      0.44      0.76         1      0.83
          10          0.05      0.09      0.36      0.83         1

Table 4.9. EFFICACIES for α₁ = 2 & α₂ = 2, (θ, γ) = (E, E).

Censoring : α₁ = 2 & α₂ = 2.
Score Generating Distribution = EMV.
Underlying Distribution = EMV.

          ρ        ρ₁=0.05    ρ₁=0.2      ρ₁=1      ρ₁=3     ρ₁=10
p = 0.2   0.05        0.10      0.10      0.04      0.01      0.00
          0.2         0.10      0.11      0.07      0.02      0.01
          1           0.12      0.16      0.26      0.32      0.18
          3           0.15      0.36      1.29      1.54      1.49
          10          0.32      1.66     11.12     15.59     16.10

p = 0.5   0.05        0.25      0.25      0.14      0.03      0.01
          0.2         0.25      0.26      0.18      0.06      0.02
          1           0.27      0.35      0.50      0.40      0.30
          3           0.33      0.62      2.00      2.50      2.38
          10          0.56      2.16     15.13     24.03     25.25

p = 0.8   0.05        0.40      0.40      0.30      0.10      0.01
          0.2         0.40      0.41      0.33      0.13      0.03
          1           0.42      0.46      0.56      0.42      0.24
          3           0.45      0.61      1.38      1.84      1.65
          10          0.58      1.28      7.14     14.70     16.40

Table 4.10. AREs for α₁ = 2 & α₂ = 2, (θ, γ) = (E, E).

Censoring : α₁ = 2 & α₂ = 2.
Score Generating Distribution = EMV.
Underlying Distribution = EMV.

          ρ        ρ₁=0.05    ρ₁=0.2      ρ₁=1      ρ₁=3     ρ₁=10
p = 0.2   0.05           1      1.00      0.40      0.10      0.00
          0.2         0.91         1      0.64      0.18      0.09
          1           0.46      0.61         1      0.85      0.69
          3           0.10      0.23      0.84         1      0.97
          10          0.02      0.10      0.69      0.97         1

p = 0.5   0.05           1      1.00      0.56      0.12      0.04
          0.2         0.96         1      0.69      0.23      0.08
          1           0.54      0.70         1      0.80      0.60
          3           0.13      0.25      0.80         1      0.95
          10          0.02      0.08      0.60      0.95         1

p = 0.8   0.05           1      1.00      0.75      0.25      0.02
          0.2         0.97         1      0.80      0.31      0.07
          1           0.75      0.82         1      0.75      0.43
          3           0.24      0.33      0.75         1      0.90
          10          0.03      0.08      0.43      0.90         1

Table 4.11. EFFICACIES for α₁ = 4 & α₂ = 4, (θ, γ) = (E, E).

Censoring : α₁ = 4 & α₂ = 4.
Score Generating Distribution = EMV.
Underlying Distribution = EMV.

          ρ        ρ₁=0.05    ρ₁=0.2      ρ₁=1      ρ₁=3     ρ₁=10
p = 0.2   0.05        0.07      0.06      0.02      0.01      0.00
          0.2         0.07      0.07      0.04      0.02      0.01
          1           0.08      0.13      0.23      0.20      0.17
          3           0.12      0.36      1.32      1.51      1.47
          10          0.32      2.05     12.26     15.72     16.07

p = 0.5   0.05        0.17      0.16      0.08      0.02      0.00
          0.2         0.17      0.18      0.11      0.04      0.02
          1           0.19      0.27      0.42      0.35      0.28
          3           0.25      0.57      2.02      2.42      2.34
          10          0.51      2.52     17.07     24.32     25.17

p = 0.8   0.05        0.27      0.26      0.18      0.05      0.01
          0.2         0.27      0.27      0.21      0.08      0.02
          1           0.28      0.33      0.43      0.33      0.21
          3           0.32      0.48      1.31      1.71      1.58
          10          0.45      1.26      8.17     15.04     16.27

Table 4.12. AREs for α₁ = 4 & α₂ = 4, (θ, γ) = (E, E).

Censoring : α₁ = 4 & α₂ = 4.
Score Generating Distribution = EMV.
Underlying Distribution = EMV.

          ρ        ρ₁=0.05    ρ₁=0.2      ρ₁=1      ρ₁=3     ρ₁=10
p = 0.2   0.05           1      0.86      0.28      0.14      0.00
          0.2         1.00         1      0.57      0.28      0.14
          1           0.34      0.57         1      0.87      0.74
          3           0.08      0.24      0.87         1      0.97
          10          0.02      0.13      0.76      0.98         1

p = 0.5   0.05           1      0.94      0.47      0.12      0.00
          0.2         0.94         1      0.61      0.22      0.11
          1           0.45      0.64         1      0.83      0.67
          3           0.10      0.23      0.83         1      0.97
          10          0.02      0.10      0.68      0.97         1

p = 0.8   0.05           1      0.96      0.67      0.18      0.04
          0.2         1.00         1      0.78      0.30      0.07
          1           0.65      0.77         1      0.77      0.49
          3           0.19      0.28      0.77         1      0.92
          10          0.03      0.07      0.50      0.92         1


Discussion  For  Misspecification  of  Underlying  Distribution 

In order to evaluate the loss in ARE when the underlying distribution was
misspecified we looked at the efficacies when ρ₁ = ρ, and when the censoring in each
of the two groups was equal, i.e., α₁ = α₂. This way we remove any loss in
efficiency due to a misspecification of ρ or due to unequal censoring. All the loss,
if any, will be purely due to a misspecification of density. Here it should be noted
that by "misspecification of density" we mean that, given an underlying distribution
of the failure times, one uses a different density to generate the scores. For this
purpose we looked at efficacies when the underlying density was logistic and extreme
minimum value. Scores were also generated from logistic and extreme minimum value
densities. Since the loss in efficiency for misspecification of densities was minimal
(most of the AREs were around 90%), we chose not to tabulate them here.

The fact that there is minimal loss in efficiency when the scores were generated
from a density different from the underlying density is not surprising. Here
is one instance where the distribution-free nature of our statistic comes into play.


CHAPTER  5 
MONTE  CARLO  STUDIES 

5.1  Introduction 

The results of Chapters 3 and 4 enable us to evaluate the performance of the linear
rank tests, T(∞, K, ρ), derived for the situation described in section (2.1), involving
responders and nonresponders. It has been seen through Tables (4.2), (4.4), and (4.6)
(when the underlying distribution is logistic) and (4.8), (4.10), and (4.12) (when the
underlying distribution is extreme minimum value) that there is a loss in Pitman's ARE
when the shift parameters c and d are misspecified. In the current chapter Monte
Carlo simulation studies are performed in order to evaluate the power of T(∞, K, ρ)
under different settings based on the amount of location shift, degree of censoring,
and sample sizes.

It is clear that the test based on T(∞, K, ρ) requires a priori knowledge of the
value of ρ. In clinical trials one could use a ρ established from a previous study. Also,
one could specify a desired value of ρ when designing a clinical trial. In either case
one can assume ρ to be prespecified. It is however possible that one could encounter
a situation where ρ is unknown. In the following sections results are given of Monte
Carlo simulations performed for both cases, when ρ is known and when ρ is unknown.

5.2    Monte Carlo Power Calculations When ρ is Assumed Known

In this section we show power calculations from a Monte Carlo study when ρ is
assumed known. All simulations were done on the Sun Ultra, Enterprise 450 computer
using Ox 2.0 (Doornik 1998). Random number generators in the Ox 2.0 environment
were used to generate all random numbers.

5.2.1    Parameter  Settings 

The parameter settings considered in the Monte Carlo simulations are:

1.  Sample sizes : (300, 400).

2.  Proportion of responders: 30% (under H₀).

3.  Underlying  densities  :  logistic  and  extreme  minimum  value  (EMV). 

4.  Censoring  rates  :  (10%,  40%). 

5.  Censoring  distribution:  log(Uniform(0,  M)),  where  M  was  determined  using 
the  equation 

P{C<T)  =  q,  (5.1) 

where  q  =  %  censoring.  Equation  (5.1)  for  data  from  a  logistic  density  with 
location  shift  parameter  6  is  given  by 

The  corresponding  equation  when  data  is  generated  from  an  extreme  minimum 
value  density,  with  location  shift  parameter  6,  is 


6.  c  :  (2.5,  10). 

7.  d  :  (2.5,  10). 

8.  Δ = (0, 0.035667).

9.  Number  of  runs  :  3000. 




Description  of  the  Simulation  Study 

For each setting of c, d, and Δ, numbers of responders were generated from a
binomial distribution, with n = 3000 and p computed using the equations (2.14) and
(2.15). For these observations log-failure times were set at −100 and the censoring
indicator was set at uncensored (=1). Failure times for each of the 3000 repetitions
were generated from the logistic and the extreme minimum value densities. The
different location shifts for each group were added to each observation. For each
sample, censoring times were generated from the natural logarithm of a uniform random
variable with support (0, M), i.e.,

\[
L(x) = 1 - \frac{1}{M} e^{x}; \qquad -\infty < x < \log(M), \quad M > 0.
\]

Values of M were computed from equations (5.2) and (5.3) to give censoring rates
of 10% and 40%. After censoring was incorporated, the average % censoring was
computed to check the desired level of censoring over the 3000 samples.
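
One way to obtain M for a target censoring rate q, and then check the realised rate by simulation, is sketched below. The closed forms for P(C < T) are derived here for unit-scale logistic and extreme minimum value failure times with C = log(Uniform(0, M)); they are an assumption of this sketch rather than a transcription of equations (5.2) and (5.3). The function names, the zero location shift, the seed, and the 100,000 simulated observations are likewise illustrative.

```python
import math
import random


def censoring_prob(m: float, shift: float, density: str) -> float:
    """P(C < T) for C = log(U), U ~ Uniform(0, m), as (1/m) * int_0^m P(e^T > u) du."""
    scale = math.exp(shift)
    if density == "logistic":   # S(t) = 1 / (1 + exp(t - shift))
        return scale * math.log1p(m / scale) / m
    if density == "emv":        # S(t) = exp(-exp(t - shift))
        return scale * (1.0 - math.exp(-m / scale)) / m
    raise ValueError("density must be 'logistic' or 'emv'")


def solve_m(q: float, shift: float, density: str) -> float:
    """Bisection for m with censoring_prob(m) = q (the probability decreases in m)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    lo, hi = 1e-9, 1.0
    while censoring_prob(hi, shift, density) > q:   # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if censoring_prob(mid, shift, density) > q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simulated_rate(m: float, shift: float, n: int, seed: int = 0) -> float:
    """Empirical censoring rate for logistic log-failure times (inverse-CDF sampling)."""
    rng = random.Random(seed)
    censored = 0
    for _ in range(n):
        v = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep the logit finite
        t = shift + math.log(v / (1.0 - v))              # logistic log-failure time
        c = math.log(m * (1.0 - rng.random()))           # log of a Uniform(0, m] draw
        censored += c < t
    return censored / n


for q in (0.10, 0.40):
    m = solve_m(q, shift=0.0, density="logistic")
    print(q, round(m, 4), round(simulated_rate(m, 0.0, 100_000), 3))
```

The final loop mirrors the check described above: the simulated average censoring rate should sit close to the target q.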

Observed α-levels

Monte Carlo simulations were performed to evaluate the observed α-level for equal
sample sizes of 300 for 3000 Monte Carlo runs. Power when Δ = 0 is an observed
value of the Type I error rate. The observed α-levels of our combined test, given in
equation (2.19), are given in tables 5.1 and 5.2, for data generated from the logistic
and the extreme minimum value densities, respectively. At each setting the
observed value of the combined test T was computed for each of the 3000 runs. Then
the observed α was obtained by counting the number of observed z's that exceeded
1.96 and those that were below −1.96. The final count was then divided by 3000. In
these tables observed α-levels are compared to a nominal α = 0.0500. The smallest
and the largest observed α in Tables 5.1 and 5.2, collectively, are 0.0450 and
0.0570, respectively. The 95% confidence intervals around 0.0450 and 0.0570 are
(0.0376, 0.0524) and (0.0487, 0.0653), respectively. Both these confidence intervals
contain 0.0500, implying that, with 95% confidence, all observed α's shown in tables
5.1 and 5.2 are not significantly different from 0.0500. Hence it can be concluded
that T, given in equation (2.19), is able to hold a nominal α-level of 0.0500.


Table 5.1. Observed α for f(x) = Logistic, Reps = 3000.

            Censoring                    Peto & Peto           Logrank
ρ = d/c     (% Group 1, % Group 2)    < -1.96   > +1.96    < -1.96   > +1.96
ρ = 4       (10%, 10%)                 0.0493    0.0490     0.0497    0.0457
            (10%, 40%)                 0.0490    0.0453     0.0507    0.0450
            (40%, 10%)                 0.0477    0.0487     0.0490    0.0457
            (40%, 40%)                 0.0480    0.0490     0.0520    0.0490
ρ = 0.25    (10%, 10%)                 0.0483    0.0557     0.0477    0.0487
            (10%, 40%)                 0.0517    0.0563     0.0497    0.0537
            (40%, 10%)                 0.0487    0.0543     0.0483    0.0533
            (40%, 40%)                 0.0483    0.0557     0.0510    0.0537


Table 5.2. Observed α for f(x) = EMV, Reps = 3000.

            Censoring                    Peto & Peto           Logrank
ρ = d/c     (% Group 1, % Group 2)    < -1.96   > +1.96    < -1.96   > +1.96
ρ = 4       (10%, 10%)                 0.0523    0.0477     0.0527    0.0563
            (10%, 40%)                 0.0503    0.0470     0.0560    0.0473
            (40%, 10%)                 0.0490    0.0493     0.0543    0.0503
            (40%, 40%)                 0.0500    0.0493     0.0553    0.0500
ρ = 0.25    (10%, 10%)                 0.0563    0.0527     0.0570    0.0570
            (10%, 40%)                 0.0530    0.0500     0.0553    0.0520
            (40%, 10%)                 0.0520    0.0530     0.0540    0.0510
            (40%, 40%)                 0.0540    0.0500     0.0510    0.0483
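
The two confidence intervals quoted for the observed α-levels can be reproduced with a normal-approximation (Wald) interval for a proportion based on 3000 runs. The function name is hypothetical, and the use of the Wald form is an assumption that happens to match the quoted limits.

```python
import math


def wald_interval(alpha_hat: float, runs: int, z: float = 1.96):
    """Normal-approximation 95% interval for an observed rejection rate."""
    half_width = z * math.sqrt(alpha_hat * (1.0 - alpha_hat) / runs)
    return alpha_hat - half_width, alpha_hat + half_width


for observed in (0.0450, 0.0570):
    low, high = wald_interval(observed, 3000)
    print(f"{observed:.4f}: ({low:.4f}, {high:.4f})")
```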


Powers under Hₐ


Consider tables 5.3 - 5.34. These tables give empirical powers when Δ =
0.025. In each table the first column shows the labels of the different tests performed.
Specifically, "PP(Pooled)" stands for Peto & Peto's (1972) test performed on the
data with nonresponders treated as responders with response duration equal to zero.
Similarly "L(Pooled)" is the corresponding logrank test. The tests with "Combined"
in parentheses are the combined tests as given in equation (2.19). The last two rows
in each table refer to "Separate" tests for the proportions and the failure times. The
powers for separate tests were corrected using Bonferroni's inequality. The second
column in each table gives the Monte Carlo power in the respective settings. The
column labeled "diff" is the difference in power compared to the LMPT in each
case. For instance, if data are generated from the extreme minimum value
density, then we expect the logrank version of our combined test to be locally most
powerful. Then "diff" is the difference in power compared to the "L(Combined)"
test. The column labeled "SE(diff)" is the upper bound of the estimate of the Monte
Carlo error. The formula used is


V      Number  of  runs 

Note  that  since  any  two  tests  are  positively  correlated  the  actual  standard  error  will  be 
smaller  than  those  that  have  been  displayed.  The  final  column  in  each  table  displays 
the  number  of  standard  deviations  that  separates  the  powers  of  the  other  tests  from 
the  LMPT. 
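The "SE(diff)" and "diff/SE(diff)" columns can be recomputed from the reported powers. The sketch below assumes the bound is the square root of the sum of the two binomial variances divided by the number of runs (3000), which treats the two tests as uncorrelated; this assumption reproduces the SE(diff) entries of Table 5.3. The function name `se_diff_upper` is hypothetical.

```python
import math

N_RUNS = 3000  # Monte Carlo runs reported for the simulation study

def se_diff_upper(power_ref, power_other, n_runs=N_RUNS):
    """Upper bound on the SE of a difference of two Monte Carlo powers.

    Ignores the (positive) covariance between the tests, so the true
    standard error is no larger than this value.
    """
    var_sum = power_ref * (1 - power_ref) + power_other * (1 - power_other)
    return math.sqrt(var_sum / n_runs)

# Powers taken from Table 5.3; PP (Combined) is the reference test there.
reference = 0.65167
others = {"PP (Pooled)": 0.64800, "L (Pooled)": 0.47167, "Times": 0.05200}

for label, power in others.items():
    diff = reference - power
    se = se_diff_upper(reference, power)
    print(f"{label:12s} diff={diff:.5f} SE(diff)={se:.4f} ratio={diff / se:.2f}")
```

Small discrepancies in the last digit of the ratio are possible because the tables appear to divide by the rounded standard error.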

Logistic  density 

Tables 5.3-5.18 display the Monte Carlo powers for various settings of the
parameters when data are generated from a logistic density. It is clear from all these
tables that the Peto & Peto (1972) (combined) test has the highest power. Tables 5.3-5.10
show powers when p = 4. From these tables we see that the power of the Peto &
Peto (pooled) test is not significantly lower than that of the Peto & Peto (combined) test.
The power of the logrank (pooled) test, in contrast, is significantly lower than that of
the Peto & Peto (combined) test. However, for p = 4, the power of the logrank (combined)
test does not seem to be significantly lower than that of the Peto & Peto (combined) test.
Also to be noted is that for p = 4 powers do not drop much as the censoring gets heavier.
This is expected since in this case the behavior of the combined test T is dominated
by the binomial component of T. Hence a change in the censoring pattern may not
affect the power drastically. The absence of a significant drop in power from the
Peto & Peto (combined) test to the Peto & Peto (pooled) test is possibly owing to the
same fact: for p = 4 the test is dominated by its binomial component, and
hence in order to notice a significant difference in power one needs to look at a larger
departure of p from 1.

Tables 5.11-5.18 show powers when p = 0.25. This is the situation in which the
failure-time component dominates the behavior of T (see equation (2.19)). Note that for
p = 0.25 the decline in power of all other tests, compared to the Peto & Peto (combined)
test, is more pronounced than when p = 4. The powers for the logrank (combined) and
logrank (pooled) tests seem to get closer to that of the Peto & Peto (combined) test as
the degree of censoring is increased. This happens for equal sample sizes of 300. When
sample sizes are unequal, though, the Peto & Peto (combined) test is clearly the most
powerful in comparison to any other pooled, combined, or separate test. The overall
decline in power when censoring gets heavier is more pronounced when p = 0.25 than
when p = 4.

Powers in Tables 5.11-5.18 (p = 0.25) are in general lower than those in Tables 5.3-5.10 (p = 4); the separate test for the failure times is the exception.

In conclusion, for data following a logistic distribution the Peto & Peto (combined)
test is consistently more powerful than the Peto & Peto (pooled),
logrank (combined), logrank (pooled), and separate tests. In some specific settings
the logrank (combined) and the Peto & Peto (pooled) tests may replace the Peto & Peto
(combined) test without a significant loss in power, as discussed above.


Extreme  Minimum  Value  Density 

Tables 5.19-5.34 give powers from a Monte Carlo simulation with data generated
from the extreme minimum value density. Tables 5.19-5.26 give the powers when
p = 4, and Tables 5.27-5.34 show powers when p = 0.25. Tables 5.19-5.22 show
powers for equal sample sizes, namely L1 = L2 = 300, for censoring patterns
(10%, 10%), (10%, 40%), (40%, 10%), and (40%, 40%) in the two groups, when
p = 4. Tables 5.27-5.30 show powers for the same set of censoring patterns,
with equal sample sizes, but now for p = 0.25. Tables 5.23-5.26
and 5.31-5.34 show powers when sample sizes are unequal, namely
L1 = 300 and L2 = 400. All powers are computed at α = 0.05.

From the above tables, clearly, in every setting neither of the separate tests performs
well compared to the combined version of the logrank test. It is also clear that,
when p = 4, the Peto & Peto (pooled) and (combined) tests both seem not to lose
significant power relative to the logrank (combined) test. Overall, powers when one of
the samples is of size 400 are higher than those for equal sample sizes of 300. The loss
in power of the Peto & Peto (combined) test relative to the combined logrank test is
more pronounced when the sample sizes are unequal.

When p = 0.25, the power of the logrank (combined) test is the highest among
all comparable tests.

In conclusion, for data following an extreme minimum value distribution, the logrank
(combined) test is consistently more powerful than the logrank (pooled),
Peto & Peto (combined), Peto & Peto (pooled), and separate tests. In some specific
settings the Peto & Peto (combined) test may replace the logrank (combined) test without
a significant loss in power.


Conclusion 

Overall, according to the Monte Carlo simulation study, the combined test using
a prespecified value of p given by equation (3.1) is significantly more powerful than
pooled tests or separate tests, barring the exceptions mentioned above. Since the
significance is with reference to an upper bound of the standard error of the Monte Carlo
powers, in reality the evidence will be stronger if one were to compute the respective
covariances for comparing the powers. Owing to time and resource constraints,
simulations were performed for 3000 Monte Carlo runs. In order to get a smaller Monte
Carlo standard error a larger scale simulation is required. A larger scale simulation is
likely to reveal a greater separation in power between tests. When p > 1, additional
simulations with p > 4 are likely to show a clear advantage of the "Combined" tests
over the "Pooled" tests.


Table 5.3. Power for p = 4, A1 = 10%, A2 = 10%, L1 = 300, L2 = 300, Logistic.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.64800   0.00367   0.0123     0.298
PP (Combined)     0.65167
L (Pooled)        0.47167   0.18000   0.0126     14.286
L (Combined)      0.64333   0.00834   0.0123     0.678
Proportions       0.51767   0.13400   0.0126     10.635
Times             0.05200   0.59967   0.0096     62.466

Table 5.4. Power for p = 4, A1 = 10%, A2 = 40%, L1 = 300, L2 = 300, Logistic.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.64800   0.00167   0.0123     0.136
PP (Combined)     0.64967
L (Pooled)        0.44900   0.20067   0.0126     15.926
L (Combined)      0.64400   0.00567   0.0123     0.461
Proportions       0.51767   0.13200   0.0126     10.476
Times             0.05433   0.59534   0.0096     62.014

Table 5.5. Power for p = 4, A1 = 40%, A2 = 10%, L1 = 300, L2 = 300, Logistic.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.65100   0.00033   0.0123     0.027
PP (Combined)     0.65133
L (Pooled)        0.47767   0.17366   0.0126     13.782
L (Combined)      0.64900   0.00233   0.0123     0.189
Proportions       0.51767   0.13366   0.0126     10.608
Times             0.05633   0.59500   0.0097     61.340

Table 5.6. Power for p = 4, A1 = 40%, A2 = 40%, L1 = 300, L2 = 300, Logistic.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.65000   0.00100   0.0123     0.081
PP (Combined)     0.65100
L (Pooled)        0.48067   0.17033   0.0126     13.518
L (Combined)      0.64733   0.00367   0.0123     0.298
Proportions       0.51767   0.13333   0.0126     10.579
Times             0.05300   0.59800   0.0096     62.292

Table 5.7. Power for p = 4, A1 = 10%, A2 = 10%, L1 = 300, L2 = 400, Logistic.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.68267   0.00309   0.0120     0.258
PP (Combined)     0.68567
L (Pooled)        0.50000   0.18576   0.0124     14.981
L (Combined)      0.68000   0.00576   0.0120     0.480
Proportions       0.55667   0.12909   0.0124     10.410
Times             0.06133   0.62442   0.0095     65.729

Table 5.8. Power for p = 4, A1 = 10%, A2 = 40%, L1 = 300, L2 = 400, Logistic.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.68300   0.00233   0.0120     0.194
PP (Combined)     0.68533
L (Pooled)        0.52000   0.16533   0.0124     13.333
L (Combined)      0.68433   0.00100   0.0120     0.083
Proportions       0.55667   0.12866   0.0124     10.376
Times             0.06367   0.62166   0.0095     65.438

Table 5.9. Power for p = 4, A1 = 40%, A2 = 10%, L1 = 300, L2 = 400, Logistic.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.68433   0.00267   0.0120     0.222
PP (Combined)     0.68700
L (Pooled)        0.50833   0.17867   0.0124     14.409
L (Combined)      0.68367   0.00333   0.0120     0.278
Proportions       0.55667   0.13033   0.0124     10.510
Times             0.05500   0.63200   0.0094     67.234

Table 5.10. Power for p = 4, A1 = 40%, A2 = 40%, L1 = 300, L2 = 400, Logistic.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.68500   0.00033   0.0120     0.0275
PP (Combined)     0.68533
L (Pooled)        0.52500   0.16033   0.0124     12.930
L (Combined)      0.68433   0.00100   0.0120     0.083
Proportions       0.55667   0.12866   0.0124     10.376
Times             0.06267   0.62266   0.0095     65.543

Table 5.11. Power for p = 0.25, A1 = 10%, A2 = 10%, L1 = 300, L2 = 300, f(x) = Logistic.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.18033   0.24634   0.0114     21.372
PP (Combined)     0.42667
L (Pooled)        0.34233   0.08434   0.0132     6.389
L (Combined)      0.36067   0.06600   0.0126     5.238
Proportions       0.07467   0.35200   0.0102     34.510
Times             0.23367   0.19400   0.0119     16.302

Table 5.12. Power for p = 0.25, A1 = 10%, A2 = 40%, L1 = 300, L2 = 300, f(x) = Logistic.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.18000   0.24367   0.0114     21.374
PP (Combined)     0.42367
L (Pooled)        0.33600   0.08767   0.0132     6.642
L (Combined)      0.37267   0.05100   0.0126     4.048
Proportions       0.07467   0.34900   0.0102     34.216
Times             0.24000   0.18367   0.0119     15.434

Table 5.13. Power for p = 0.25, A1 = 40%, A2 = 10%, L1 = 300, L2 = 300, f(x) = Logistic.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.17867   0.24033   0.0114     21.082
PP (Combined)     0.41900
L (Pooled)        0.33800   0.08100   0.0132     6.136
L (Combined)      0.37033   0.04570   0.0126     3.627
Proportions       0.07467   0.34433   0.0102     33.758
Times             0.24533   0.17367   0.0119     14.594

Table 5.14. Power for p = 0.25, A1 = 40%, A2 = 40%, L1 = 300, L2 = 300, f(x) = Logistic.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.17767   0.23733   0.0114     20.818
PP (Combined)     0.41500
L (Pooled)        0.34067   0.07433   0.0132     5.631
L (Combined)      0.37367   0.04133   0.0126     3.280
Proportions       0.07467   0.34033   0.0102     33.366
Times             0.24133   0.17367   0.0119     14.594

Table 5.15. Power for p = 0.25, A1 = 10%, A2 = 10%, L1 = 300, L2 = 400, f(x) = Logistic.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.20333   0.27867   0.0117     23.818
PP (Combined)     0.48200
L (Pooled)        0.36567   0.11633   0.0127     9.160
L (Combined)      0.39900   0.08300   0.0128     6.484
Proportions       0.07700   0.40500   0.0103     39.320
Times             0.25633   0.22567   0.0121     18.650

Table 5.16. Power for p = 0.25, A1 = 10%, A2 = 40%, L1 = 300, L2 = 400, f(x) = Logistic.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.20300   0.27100   0.0117     23.162
PP (Combined)     0.47400
L (Pooled)        0.37633   0.09767   0.0127     7.690
L (Combined)      0.40667   0.06733   0.0128     5.260
Proportions       0.07700   0.39700   0.0103     38.544
Times             0.26267   0.21133   0.0121     17.465

Table 5.17. Power for p = 0.25, A1 = 40%, A2 = 10%, L1 = 300, L2 = 400, f(x) = Logistic.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.20133   0.27434   0.0117     23.448
PP (Combined)     0.47567
L (Pooled)        0.38300   0.09267   0.0127     7.297
L (Combined)      0.42967   0.04600   0.0128     3.594
Proportions       0.07700   0.39867   0.0103     38.706
Times             0.27467   0.20100   0.0121     16.612

Table 5.18. Power for p = 0.25, A1 = 40%, A2 = 40%, L1 = 300, L2 = 400, f(x) = Logistic.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.20167   0.26833   0.0117     22.934
PP (Combined)     0.47000
L (Pooled)        0.37033   0.09967   0.0127     7.848
L (Combined)      0.42133   0.04867   0.0128     3.802
Proportions       0.07700   0.39300   0.0103     38.155
Times             0.28167   0.18833   0.0121     15.564

Table 5.19. Power for p = 4, A1 = 10%, A2 = 10%, L1 = 300, L2 = 300, f(x) = EMV.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.65933   0.00700   0.0122     0.574
PP (Combined)     0.65933   0.00700   0.0122     0.574
L (Pooled)        0.55100   0.11533   0.0125     9.226
L (Combined)      0.66633
Proportions       0.51767   0.14866   0.0125     11.893
Times             0.09000   0.57633   0.0101     57.062

Table 5.20. Power for p = 4, A1 = 10%, A2 = 40%, L1 = 300, L2 = 300, f(x) = EMV.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.65667   0.00933   0.0122     0.765
PP (Combined)     0.66000   0.00600   0.0122     0.492
L (Pooled)        0.55600   0.11000   0.0125     8.800
L (Combined)      0.66600
Proportions       0.51767   0.14833   0.0125     11.866
Times             0.08700   0.57900   0.0100     57.900

Table 5.21. Power for p = 4, A1 = 40%, A2 = 10%, L1 = 300, L2 = 300, f(x) = EMV.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.66067   0.00333   0.0122     0.273
PP (Combined)     0.65767   0.00633   0.0122     0.519
L (Pooled)        0.54900   0.11500   0.0125     9.200
L (Combined)      0.66400
Proportions       0.51767   0.14633   0.0125     11.706
Times             0.08500   0.57900   0.0100     57.900

Table 5.22. Power for p = 4, A1 = 40%, A2 = 40%, L1 = 300, L2 = 300, f(x) = EMV.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.66033   0.00267   0.0122     0.219
PP (Combined)     0.65767   0.00533   0.0122     0.437
L (Pooled)        0.55033   0.11267   0.0125     9.014
L (Combined)      0.66300
Proportions       0.51767   0.14533   0.0125     11.626
Times             0.08167   0.58133   0.0100     58.133

Table 5.23. Power for p = 4, A1 = 10%, A2 = 10%, L1 = 300, L2 = 400, f(x) = EMV.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.69167   0.01233   0.0118     1.045
PP (Combined)     0.69200   0.01200   0.0118     1.017
L (Pooled)        0.57767   0.12633   0.0123     10.271
L (Combined)      0.70400
Proportions       0.55667   0.14733   0.0123     11.978
Times             0.09433   0.60967   0.0099     61.583

Table 5.24. Power for p = 4, A1 = 10%, A2 = 40%, L1 = 300, L2 = 400, f(x) = EMV.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.69133   0.01067   0.0118     0.904
PP (Combined)     0.69233   0.00967   0.0118     0.819
L (Pooled)        0.58000   0.12200   0.0123     9.919
L (Combined)      0.70200
Proportions       0.55667   0.14533   0.0123     11.815
Times             0.09433   0.60767   0.0099     61.381

Table 5.25. Power for p = 4, A1 = 40%, A2 = 10%, L1 = 300, L2 = 400, f(x) = EMV.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.69033   0.00770   0.0118     0.652
PP (Combined)     0.69233   0.00867   0.0118     0.735
L (Pooled)        0.58088   0.12012   0.0123     9.766
L (Combined)      0.70100
Proportions       0.55667   0.14433   0.0123     11.734
Times             0.09067   0.61033   0.0099     61.649

Table 5.26. Power for p = 4, A1 = 40%, A2 = 40%, L1 = 300, L2 = 400, f(x) = EMV.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.69100   0.00900   0.0118     0.763
PP (Combined)     0.69233   0.00767   0.0118     0.650
L (Pooled)        0.58400   0.11600   0.0123     9.431
L (Combined)      0.70000
Proportions       0.55667   0.14330   0.0123     11.650
Times             0.09100   0.60900   0.0099     61.515

Table 5.27. Power for p = 0.25, A1 = 10%, A2 = 10%, L1 = 300, L2 = 300, f(x) = EMV.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.22000   0.54967   0.0108     50.895
PP (Combined)     0.66933   0.10034   0.0115     8.725
L (Pooled)        0.66000   0.10967   0.0116     9.454
L (Combined)      0.76967
Proportions       0.07467   0.69500   0.0091     76.374
Times             0.63500   0.13467   0.0117     11.51

Table 5.28. Power for p = 0.25, A1 = 10%, A2 = 40%, L1 = 300, L2 = 300, f(x) = EMV.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.21767   0.52766   0.0109     48.409
PP (Combined)     0.66033   0.08500   0.0117     7.265
L (Pooled)        0.64833   0.09700   0.0118     8.220
L (Combined)      0.74533
Proportions       0.07467   0.67066   0.0093     72.114
Times             0.61533   0.13000   0.0119     10.924

Table 5.29. Power for p = 0.25, A1 = 40%, A2 = 10%, L1 = 300, L2 = 300, f(x) = EMV.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.21800   0.52067   0.0110     47.334
PP (Combined)     0.66133   0.07734   0.0118     6.554
L (Pooled)        0.63267   0.10600   0.0119     8.908
L (Combined)      0.73867
Proportions       0.07467   0.66400   0.0093     71.398
Times             0.60567   0.13300   0.0120     11.083

Table 5.30. Power for p = 0.25, A1 = 40%, A2 = 40%, L1 = 300, L2 = 300, f(x) = EMV.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.21700   0.50835   0.0111     45.797
PP (Combined)     0.65533   0.07002   0.0119     5.884
L (Pooled)        0.62467   0.10068   0.0120     8.390
L (Combined)      0.72535
Proportions       0.07467   0.65068   0.0094     69.221
Times             0.59300   0.13235   0.0121     10.938

Table 5.31. Power for p = 0.25, A1 = 10%, A2 = 10%, L1 = 300, L2 = 400, f(x) = EMV.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.24500   0.57167   0.0106     53.931
PP (Combined)     0.72100   0.09567   0.0108     8.858
L (Pooled)        0.71200   0.10467   0.0109     9.603
L (Combined)      0.81667
Proportions       0.07700   0.73967   0.0086     86.008
Times             0.69833   0.11834   0.0110     10.758

Table 5.32. Power for p = 0.25, A1 = 10%, A2 = 40%, L1 = 300, L2 = 400, f(x) = EMV.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.24400   0.55633   0.0107     51.993
PP (Combined)     0.71267   0.08766   0.0110     7.969
L (Pooled)        0.70100   0.09933   0.0111     8.949
L (Combined)      0.80033
Proportions       0.07700   0.72333   0.0088     82.196
Times             0.68233   0.11800   0.0112     10.536

Table 5.33. Power for p = 0.25, A1 = 40%, A2 = 10%, L1 = 300, L2 = 400, f(x) = EMV.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.24333   0.54534   0.0108     50.494
PP (Combined)     0.70833   0.08034   0.0112     7.173
L (Pooled)        0.68167   0.10700   0.0113     9.469
L (Combined)      0.78867
Proportions       0.07700   0.71167   0.0090     79.074
Times             0.66167   0.12700   0.0114     11.140

Table 5.34. Power for p = 0.25, A1 = 40%, A2 = 40%, L1 = 300, L2 = 400, f(x) = EMV.

                  Power     diff      SE(diff)   diff/SE(diff)
PP (Pooled)       0.24167   0.53700   0.0109     49.266
PP (Combined)     0.70233   0.07634   0.0113     6.756
L (Pooled)        0.67333   0.10534   0.0114     9.240
L (Combined)      0.77867
Proportions       0.07700   0.70167   0.0090     77.963
Times             0.65000   0.12867   0.0115     11.189

5.3   Monte  Carlo  Power  Calculations  When  p  is  Unknown 

In this section we address the situation when p is unknown. For demonstration
purposes, Monte Carlo powers for data from the extreme minimum value density are
shown.

5.3.1    Parameter  Settings 

The  parameter  settings  considered  in  the  Monte  Carlo  simulations  are: 

1. Sample sizes: 500 (each sample).

2. Proportion of responders: 30% (under H0).

3. Underlying densities: extreme minimum value (EMV).

4. Censoring rates: (40%, 40%).

5. Censoring was incorporated in the data using the exact scheme discussed in
section 5.2.

6. c: (2.5, 10).

7. d: (2.5, 10).

8. For p = 0.25,

Δ = (0, 0.028768, 0.035667).

For p = 4,

Δ = (0, 0.025, 0.035).

Note that Δ = 0.028768 and Δ = 0.035667 correspond to 25% and 30% location
shifts in the failure time distributions between the two samples, respectively.

9. Number of runs: 3000.
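The two nonzero shift values for p = 0.25 in item 8 can be checked numerically. The sketch below rests on an interpretation that is not stated in the text: that c = 10 when p = d/c = 0.25, and that an s% shift means the location shift on the log scale is cΔ = -log(1 - s). Under that reading the listed values are recovered to six decimals. The function name `shift_to_delta` is hypothetical.

```python
import math

def shift_to_delta(shift_fraction, c):
    """Delta such that c * Delta = -log(1 - shift_fraction)."""
    return -math.log(1.0 - shift_fraction) / c

# Assumed here: c = 10 and d = 2.5, so that p = d / c = 0.25.
for shift in (0.25, 0.30):
    print(f"{shift:.0%} shift -> Delta = {shift_to_delta(shift, c=10.0):.6f}")
```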


Description  of  the  Simulation  Study 

The simulation study for this section was done along the lines of that for known p.
In this section, however, p was estimated. Estimating p amounts to
estimating cΔ and dΔ. Samples were generated from the exponential density, with
the hazard rate governed by a proportional hazards parameter β. Under the proportional
hazards assumption the location shift cΔ was estimated by estimating the parameter β
through maximizing the likelihood function given by equation (4.6), page 73, of
Kalbfleisch & Prentice (1980).

For each of the 3000 samples, cΔ was estimated from the proportional hazards
model and

$$\widehat{d\Delta} = \log\frac{\hat{p}_2(1-\hat{p}_1)}{\hat{p}_1(1-\hat{p}_2)}.$$

Hence

$$\hat{p} = \frac{\widehat{d\Delta}}{\widehat{c\Delta}}.$$

Tables 5.35 and 5.36 show Monte Carlo powers computed for the logrank (pooled),
logrank (combined, assuming p known), logrank (combined, using the estimate of p),
proportions (separate), and logrank (separate) tests. The nominal α for the "Separate"
tests was fixed at 0.025, using Bonferroni's correction. The observed α level for logrank
(combined, using the estimate of p) was about 0.0625, which indicated that this test must
be run at α = 0.04 instead of at α = 0.05. All powers for logrank (combined, using the
estimate of p) shown in Tables 5.35 and 5.36 are computed at α = 0.04.
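The text does not state how the adjusted level of 0.04 was obtained. One simple calibration that is consistent with the reported numbers, offered here only as an illustrative assumption, is to shrink the nominal level by the ratio of the nominal to the observed level:

```latex
\alpha_{\text{adj}}
  = \alpha_{\text{nominal}} \times \frac{\alpha_{\text{nominal}}}{\hat{\alpha}_{\text{observed}}}
  = 0.05 \times \frac{0.05}{0.0625}
  = 0.04 .
```

This treats the observed level as roughly proportional to the nominal level near 0.05; a direct simulation at the adjusted level would be needed to confirm that the test then holds 0.05.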

From Tables 5.35 and 5.36 it is clear that logrank (combined, p known) is the most
powerful. Moreover, the powers for logrank (combined, using the estimate of p), albeit
lower than those for logrank (combined, p known), are higher than those for the logrank
(pooled) and the "Separate" tests.

In  conclusion,  if  p  is  unknown  then  estimating  p  from  the  data  is  reasonable.  Of 
course  caution  must  be  exercised  when  estimating  p  in  order  to  maintain  high  power. 


Table 5.35. Empirical powers for p = 4 (upper one-sided) when p is estimated.

Statistics                         Δ = 0.0000   Δ = 0.0250   Δ = 0.0350
Logrank (Pooled)                   0.05370      0.58533      0.73933
Logrank (Combined, p known)        0.05470      0.68533      0.84067
Logrank (Combined, p estimated)    0.05270      0.66733      0.81767
Proportions (Separate)             0.02370      0.53867      0.73600
Times (Separate)                   0.02970      0.07100      0.09433

Table 5.36. Empirical powers for p = 0.25 (upper one-sided) when p is estimated.

Statistics                         Δ = 0.0000   Δ = 0.0288   Δ = 0.0357
Logrank (Pooled)                   0.05370      0.48533      0.62433
Logrank (Combined, p known)        0.05470      0.59800      0.74900
Logrank (Combined, p estimated)    0.05270      0.56400      0.70767
Proportions (Separate)             0.02370      0.07700      0.09367
Times (Separate)                   0.02970      0.45300      0.61167

5.4    Estimation  of  Location-Shift 


Because estimation of the location-shift parameter is of the utmost importance in our
context, some methods are discussed here. In general there are several nonparametric
methods for estimating the location shift cΔ. If one prefers not to make many
assumptions about the distributions of the true survival times, and if censoring is not
present, one of the most widely used robust methods of analysis is the Hodges &
Lehmann (1963) approach. Bassiakos et al. (1991) proposed an adaptation of the
Hodges-Lehmann shift estimator to take censoring into account. Lai & Ying (1991)
establish the large-sample properties of a modified Buckley-James (1979) estimator
for the linear regression model. Tsiatis (1990) uses linear rank tests, based on
Prentice (1978), to estimate the regression parameters. The equivalence between these
two estimators is presented in Ritov (1990).
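For uncensored data, the Hodges-Lehmann two-sample shift estimate is the median of all pairwise differences between the two samples. The sketch below illustrates only that uncensored case with made-up data; the censored-data adaptations cited above are not implemented, and the function name `hodges_lehmann_shift` is hypothetical.

```python
from statistics import median

def hodges_lehmann_shift(x, y):
    """Median of all pairwise differences y_j - x_i (uncensored data only)."""
    return median(y_j - x_i for x_i in x for y_j in y)

# Hypothetical log failure times for two groups (illustration only).
group_1 = [0.8, 1.1, 1.9, 2.4, 3.0]
group_2 = [1.5, 2.2, 2.6, 3.3, 4.1]

print("Estimated location shift:", hodges_lehmann_shift(group_1, group_2))
```

The estimator uses only the ordering of the pairwise differences, which is why it is robust to outlying observations in either sample.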


CHAPTER  6 
EXAMPLE 


Throughout  this  dissertation  we  have  so  far  proposed  a  test  statistic  (given  by 
equation  (2.19))  that  addresses  a  certain  structure  of  hypothesis  (given  in  equations 
(2.2)  and  (2.2))  under  a  given  premise  (as  given  in  section  2.1).  Subsequently  in 
Chapter  3  we  established  the  asymptotic  null  distribution  of  our  statistic  T.  In 
Chapter  4  we  derived  the  efficacy  formula  for  the  statistic  T.  We  also  showed  some 
ARE  results.  Chapter  5  reported  Monte  Carlo  power  calculations,  which  established 
the  strength  of  the  combined  test  statistic  T  under  various  settings.  In  this  chapter 
we  will  demonstrate  the  applicability  of  our  test  statistic  T  with  a  dataset. 

CERAD (Consortium to Establish a Registry for Alzheimer's Disease), funded by
the U.S. National Institute on Aging (NIA, grant # AG06790) in 1986, developed a
battery of standardized assessments for the evaluation of patients with Alzheimer's
Disease (AD). Instruments for the clinical, neuropsychological, and neuropathological
assessment of dementia were administered to obtain longitudinal data on subjects
enrolled at twenty-four university medical centers. Patients and control subjects were
evaluated at entry and annually thereafter, to track the natural progression of AD.

Table 6.1 shows a summary of the structure of the CERAD data. A total of 906
AD patients were considered in our analysis. Out of 364 males and 542 females, 181
males and 234 females were institutionalized. Hence the proportions institutionalized
were 49.73% for males and 43.17% for females. The median survival time after
institutionalization was 1 year for males and 2.1 years for females. For those
institutionalized, the percentage of censoring, with respect to time to death, was 17.58%
for males and 23.62% for females.

In order to perform our test based on the statistic T, we need an estimate of
p = d/c. Hence it is sufficient to estimate cΔ and dΔ. From equation (3.5) it is clear
that

$$\widehat{d\Delta} = \log\frac{\hat{p}_1(1-\hat{p}_2)}{\hat{p}_2(1-\hat{p}_1)}.$$

For the CERAD data

$$\widehat{d\Delta} = 0.26379.$$

One could regard the estimation of cΔ as a problem of estimating the slope parameter
in an accelerated failure time model. In the analysis of the CERAD data, the Buckley
& James (1979) iterative procedure for estimating a location shift in censored data
was used. The Buckley-James procedure yielded

cΔ = 0.48671.
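The estimate of dΔ can be recomputed from the counts in Table 6.1. The sketch below assumes that dΔ is the log odds ratio of institutionalization (men relative to women), which reproduces the reported value of 0.26379. The implied ratio p = d/c is then obtained by dividing by the reported Buckley-James estimate of cΔ; that ratio is not quoted in the text and is shown only as arithmetic on the two reported estimates. Variable names are hypothetical.

```python
import math

# Counts from Table 6.1 (CERAD data).
men_total, men_inst = 364, 181
women_total, women_inst = 542, 234

p1 = men_inst / men_total        # proportion institutionalized, men
p2 = women_inst / women_total    # proportion institutionalized, women

# Assumed form: log odds ratio of institutionalization, men vs. women.
d_delta = math.log(p1 * (1 - p2) / (p2 * (1 - p1)))

c_delta = 0.48671                # Buckley-James estimate reported in the text
p_ratio = d_delta / c_delta      # implied estimate of p = d / c

print(f"d*Delta = {d_delta:.5f}")
print(f"implied p = d/c = {p_ratio:.3f}")
```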

Note that since we would be interested in testing for an increase in median survival
times and a decrease in the proportion who get institutionalized, for a hypothetical
treatment group, we are confronted with a situation that is bidirectional in nature.
The alternative hypothesis is given by equation (3.5). Hence the structure of the
appropriate test statistic in this case is given by equation (3.6). As per the discussion in
section 3.2.2, pooled tests in this case must be performed
such that those not institutionalized are assumed to be institutionalized at time ∞,
uncensored. Table 6.2 gives the observed values of the Z statistic for the different
tests. Observe that Peto & Peto's (1972) "Separate" test has a Z_obs higher than the
Z_obs of the logrank test. However, since the observed values of the Z statistic are quite
large, we have chosen not to show the respective p-values. Neither of the pooled tests
has a comparatively strong Z_obs. Peto & Peto (combined) is the strongest test
(Z_obs = 5.4016), which is stronger than the Peto & Peto (separate) test (Z_obs = 5.2439).

Table 6.1. Data: CERAD

                           Men        Women
Sample size                364        542
Number institutionalized   181        234
(proportions)              (0.4973)   (0.4317)
Median survival (years)    1          2.1
Censoring (%)              17.58      23.62

Table 6.2. Analysis: CERAD

Tests                      Z_obs
Peto & Peto (pooled)       4.9251
Peto & Peto (combined)     5.4016
Logrank (pooled)           4.6617
Logrank (combined)         5.1630
Peto & Peto (separate)     5.2439
Logrank (separate)         4.7751
Proportions (separate)     1.9410

In  conclusion,  Peto  &  Peto  (combined),  being  the  most  appropriate  test  to  perform 
for  the  CERAD  data,  does  yield  the  highest  observed  value  of  the  Z  statistic. 
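As a rough cross-check of Table 6.2, the separate test for proportions can be approximated from the counts in Table 6.1, and any observed Z value can be converted to an upper-tail normal probability. The sketch below assumes a pooled-variance two-sample z-test for the proportions, which gives a value close to (but not exactly) the reported 1.9410; the exact variance estimate used in the analysis is not stated. The helper name `upper_tail` is hypothetical.

```python
import math

def upper_tail(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Counts from Table 6.1 (men, women).
x1, n1 = 181, 364
x2, n2 = 234, 542

p1, p2 = x1 / n1, x2 / n2
pooled = (x1 + x2) / (n1 + n2)
se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z_prop = (p1 - p2) / se  # assumed pooled-variance z statistic

print(f"Proportions z (recomputed): {z_prop:.4f}, upper tail {upper_tail(z_prop):.4f}")
print(f"Peto & Peto (combined) Z = 5.4016, upper tail {upper_tail(5.4016):.2e}")
```

The tail probabilities for the time-based and combined tests are extremely small, which is consistent with the decision not to tabulate p-values.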

Demonstration  of  applicability  of  the  combined  test  proposed  in  this  dissertation 
on  CERAD  data  is  now  complete. 


CHAPTER  7 
SUMMARY 


In clinical trials, when one is confronted with situations that have the same
structure as our scenario, one might consider using our combined test statistic T given in
equation (2.19), instead of performing separate tests or pooling nonresponders with
responders. Our test statistic T requires the user to have prior knowledge of p = d/c.
If p must be estimated, then it is crucial to estimate p accurately in the interest
of higher power.

A pitfall in using our test T arises when the change in responder proportions and the
change in median failure times are both close to zero. In such situations the estimate
of p may not be reliable, owing to the instability of a near-0/0 form.

Pooled tests will be identical to combined tests only when p = 1. In all other
situations one can expect our combined test T to have higher power.

Our combined test is simple in structure. Moreover, T is capable of
handling both unidirectional and bidirectional hypotheses. In other words, even if
the signs of the two components in T are opposite, the effects do not cancel each other
out.


REFERENCES 


Andersen,  P.  K.,  Borgan,  O.,  Gill,  R.,  &  Keiding,  N.  (1982).  Linear  nonparametric  tests 
for  comparison  of  counting  processes,  with  applications  to  censored  survival  data  (with 
discussion).  International  Statistical  Review  50,  219-258. 

Andersen,  R  K.,  Borgan,  O.,  Gill,  R.,  &  Keiding,  N.  (1993).  Statistical  Models  Based  on 
Counting  Processes.  New  York:  Springer  Verlag. 

Bassiakos,  Y.  C.,  Meng,  X.  L.,  k  Lo,  S.  H.  (1991).  A  general  estimator  of  the  treatment 
effect  when  the  data  are  heavily  censored.  Biometrika  78,  741-748. 

Birnbaum,  A.,  k  Laska,  E.  (1967).  Efficiency  robust  two-sample  rank  tests.  Journal  of  the 
American  Statistical  Association  62,  1241-1251. 

Buckley,  J.,  k  James,  I.  (1979).  Linear  regression  with  censored  data.  Biometrika  66, 
429-436. 

Chang,  M.  N.,  Guess,  H.  A.,  k  Heyse,  J.  F.  (1994).  Reduction  of  burden  of  illness:  A  new 
efficacy  measure  for  prevention  trials.  Statistics  In  Medicine  IS,  1807-1814. 

Chung, K. L. (1974). A Course in Probability Theory. New York: Academic Press.

Doornik,  J.  A.  (1998).  Ox:  An  Object-Oriented  Matrix  Language.  London:  Timberlake 
Consultants. 

Fleming, T. R., & Harrington, D. P. (1991). Counting Processes and Survival Analysis.
New  York:  John  Wiley. 

Gastwirth, J. L. (1970). Nonparametric Techniques in Statistical Inference. Cambridge:
Cambridge University Press.

Gill,  R.  D.  (1980).  Censoring  and  Stochastic  Integrals.  Amsterdam:  Mathematical  Centre 
Tracts  124. 

Harrington, D. P., & Fleming, T. R. (1982). A class of rank test procedures for censored
survival  data.  Biometrika  69,  553-566. 

Helland,  I.  S.  (1982).  Central  limit  theorems  for  martingales  with  discrete  or  continuous 
time.  Scandinavian  Journal  of  Statistics  9,  79-94. 

Hodges, J. L., & Lehmann, E. L. (1963). Estimates of location based on rank tests. Annals
of  Mathematical  Statistics  34,  598-611. 




Johnson, R. A., Verrill, S., & Moore, D. H., II. (1987). Two-sample rank tests for detecting
changes that occur in a small proportion of the treated population. Biometrics 43,
641-655.

Kalbfleisch, J. D., & Prentice, R. L. (1980). The Statistical Analysis of Failure Time Data.
New  York:  John  Wiley. 

Kaplan, E. L., & Meier, P. (1958). Nonparametric estimation from incomplete observations.
Journal of the American Statistical Association 53, 457-481.

Lai, T. L., & Ying, Z. (1991). Large sample theory of a modified Buckley-James estimator
in regression analysis with censored data. Annals of Statistics 19, 1370-1402.

Lee,  E.  T.,  Desu,  M.  M.,  &  Gehan,  E.  A.  (1975).  A  Monte  Carlo  study  of  the  power  of 
some  two-sample  tests.  Biometrika  62,  425-432. 

Leurgans,  S.  (1983).  Three  classes  of  censored  data  rank  tests:  Strengths  and  weaknesses 
under  censoring.  Biometrika  70,  651-658. 

Mehrotra,  K.  G.,  Michalek,  J.  E.,  &  Mihalko,  D.  (1982).  A  relationship  between  two  forms 
of  linear  rank  procedures  for  censored  data.  Biometrika  69,  674-676. 

Peto, R., & Peto, J. (1972). Asymptotically efficient rank invariant test procedures (with
discussion). Journal of the Royal Statistical Society, Series A 135, 185-206.

Prentice,  R.  L.  (1973).  Exponential  survivals  with  censoring  and  explanatory  variables. 
Biometrika  60,  279-288. 

Prentice,  R.  L.  (1978).  Linear  rank  tests  with  right  censored  data.  Biometrika  65,  167-180. 

Prentice,  R.  L.,  &  Marek,  P.  (1979).  Qualitative  discrepancy  between  censored  data  rank 
tests.  Biometrics  35,  861-867. 

Randles, R. H., & Wolfe, D. A. (1991). Introduction to the Theory of Nonparametric
Statistics. Malabar, FL: Krieger.

Ritov,  Y.  (1990).  Estimation  in  the  linear  regression  model  with  censored  data.  Annals  of 
Statistics  18,  303-328. 

Savage,  I.  R.  (1956).  Contribution  to  the  theory  of  rank  order  statistics  -  the  two-sample 
case.  Annals  of  Mathematical  Statistics  27,  590-615. 

Thisted, R. A. (1988). Elements of Statistical Computing. New York: Chapman & Hall.

Tsiatis,  A.  A.  (1990).  Estimating  regression  parameters  using  linear  rank  tests  for  censored 
data.  Annals  of  Statistics  18,  354-372. 


BIOGRAPHICAL  SKETCH 


I  was  born  Rahul  Mukherjee  in  Calcutta,  India,  on  16  June  1963.  Around  4-5  years 
of  age,  I  had  the  strange  notion  that  my  name  had  something  to  do  with  my  feeling 
confused  in  life.  During  a  kindergarten  class  I  pointed  to  the  letter  "R"  successfully. 
Thrilled,  I  started  calling  myself  Robin. 

My family moved around the country frequently, and it was hard for me to
establish a sense of belonging in any particular place. From 1972 to 1978 I attended
boarding  school.  I  completed  high  school  in  Calcutta  in  1980. 

My  interest  in  scholastics  did  not  emerge  until  late  in  my  high  school  years,  when 
I  was  fortunate  enough  to  study  with  a  wonderful  man  by  the  name  of  Mrinal  Kanti 
Basu,  my  private  tutor.  Mr.  Basu's  methods  were  quite  nonstandard.  He  did  not 
teach  me  mathematics;  actually  he  taught  me  about  love,  respect,  learning,  and 
perseverance. He would always say to me, "Robin, if He places a hurdle before you,
and He always does, He also hides a solution that you need to find." I wish Mr. Basu
were  still  among  us  today,  as  I  owe  him  my  subsequent  academic  success. 

After  graduating  from  high  school  with  top  scores  in  math  and  science  I  went 
to  St.  Xavier's  College,  Calcutta,  to  study  economics,  mathematics,  and  statistics 
at  the  pre-university  level.  I  continued  at  St.  Xavier's  College  to  earn  a  B.Sc.  in 
Mathematics  in  1988. 

In  1989  I  joined  the  M.S.  program  at  the  University  of  Southern  Mississippi, 
Hattiesburg, MS, earning an M.S. in Mathematics in 1991. I then enrolled in Virginia
Polytechnic Institute & State University, Blacksburg, VA, where I received an M.S. in
statistics  in  1993.  Fall  1993  brought  me  to  the  University  of  Florida  in  quest  of  a 
Ph.D.  in  statistics.  The  quest  has  finally  been  fulfilled. 




Starting in March 1999 I will work as a Biometrician in the CBARDS (Clinical
Biostatistics and Research Data Systems) department of Merck Research
Laboratories, Blue Bell, PA.




I certify that I have read this study and that in my opinion it conforms to
acceptable standards of scholarly presentation and is fully adequate, in scope and quality,
as  a  dissertation  for  the  degree  of  Doctor  of  Philosophy. 


Myron N. Chang, Chairman
Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to
acceptable standards of scholarly presentation and is fully adequate, in scope and quality,
as  a  dissertation  for  the  degree  of  Doctor  of  Philosophy. 


Ronald H. Randles
Professor  of  Statistics 

I certify that I have read this study and that in my opinion it conforms to
acceptable standards of scholarly presentation and is fully adequate, in scope and quality,
as  a  dissertation  for  the  degree  of  Doctor  of  Philosophy. 


Pejaver  V.  Rao 
Professor  of  Statistics 

I certify that I have read this study and that in my opinion it conforms to
acceptable standards of scholarly presentation and is fully adequate, in scope and quality,
as  a  dissertation  for  the  degree  of  Doctor  of  Philosophy. 




Jonathan  J.  Shuster 
Professor  of  Statistics 

I certify that I have read this study and that in my opinion it conforms to
acceptable standards of scholarly presentation and is fully adequate, in scope and quality,
as  a  dissertation  for  the  degree  of  Doctor  of  Philosophy. 


James L. Kepner
Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to
acceptable standards of scholarly presentation and is fully adequate, in scope and quality,
as a dissertation for the degree of Doctor of Philosophy.


Chunrong  Ai 

Associate  Professor  of  Economics 


This dissertation was submitted to the Graduate Faculty of the Department of
Statistics  in  the  College  of  Liberal  Arts  and  Sciences  and  to  the  Graduate  School  and 
was  accepted  as  partial  fulfillment  of  the  requirements  for  the  degree  of  Doctor  of 
Philosophy. 


May  1999 


Dean,  Graduate  School 


